Journal of
Experimental Education

A QUARTERLY
Edited by
A. S. BARR

With the Cooperation of

ARTHUR T. JERSILD
PALMER O. JOHNSON
H. H. REMMERS
J. WAYNE WRIGHTSTONE

VOLUME XXI
September, 1952-June, 1953

DEMBAR PUBLICATIONS, INC.
303 East Wilson Street
Madison 3, Wisconsin


TABLE OF CONTENTS
Volume XXI

September, 1952

Relationships Between Certain Attitudes Towards Teaching and
Teaching Success: Thomas Alexander Ringness

Practice Teaching Success in Relation to Other Measures of
Teaching Ability: Jacob O. Bach

December, 1952

A Comparative-Predictive Study of Students in the Four Curricula
of a Teacher Education Institution: Arnold Juel Lien

March, 1953

Multivariate Statistical Analysis of Differences Between Preprofessional
Groups of College Students: Clifford M. Christensen

Some Applications of the Method of Pivotal Condensation in Statistical
Analysis: Raymond O. Collier, Jr.

Test Instructions and Scoring Method in True-False Tests:
Evan R. Keislar

An Historical vs. Contemporary Problem Solving Use of the College
Physical Science Laboratory Period for General Education:
[author name illegible]

The Statistical Interpretation of Degrees of Freedom: William J. Moonan

A Simplified χ² Formula for Rapid Computation of Certain Item-Analysis
Data with IBM Punched-Card Equipment: John Caffrey and Fred Wheeler

June, 1953

An Investigation of the Relationship Between Teaching Effectiveness and the
Teacher's Attitude of Acceptance: Harold J. Reed

Measuring Knowledge and Application: An Experimental Investigation:
Donald E. Smith and Marvin D. Glock

An Inverted Factor Analysis Study of Student-Rated Introductory
Psychology Instructors: A. W. Bendig

Judgments by 820 College Executives of Traits Desirable in
Lower-Division College Teachers: M. R. Trabue


TABLE OF CONTENTS
Volume XXII

September, 1953

Performance in a Verbal Addition Task Related to Pre-Experimental
"Set" and Verbal Noise: E. Victor Mech

Intergroup Attitudes and Experimental Change: Margaret L. Hayes and Mary
Elizabeth Conklin

The Relationship Between the Social Structure of the Classroom and the
Academic Success of the Pupils: Margaret M. Buswell

Relations Among Factors of Raw, Deviation, and Double-Centered
Score Matrices: Chester W. Harris

Factors Related to the Extent of Mortality Among Home Economics
Students in Certain Colleges of Minnesota, Wisconsin, and Iowa
During 1943-50: Helen Y. Nelson

December, 1953

A Study of Certain Effects of Three Types of Learning Experiences in Art
as Revealed in the Drawings by Participants: Gifford C. Loomer

A Study of Fourth Grade Children's Comprehension of Certain Verbal
Abstractions: Mary C. Serra

A Statistical Analysis of Certain Educational Viewpoints Held by Teachers:
[author name illegible]

Curriculum for Primary Teachers: Sina M. Mott

A Simple Course Evaluation Scale: Ralph Mason Dreger

"Resistance to Extinction" of Two Patterns of Verbal Reinforcement:
E. Victor Mech

March, 1954

Measurement of Writing Ability at the College-Entrance Level: Objective vs.
Subjective Testing Techniques: Edith M. Huddleston

An Experimental Evaluation of the Efficacy of Two Methods of Teaching
Music Appreciation: Morton J. Keston

Functional Competence in Mathematics: G. Don Alkire

An Application of the Ferguson Method of Computing Item Conformity
and Person Conformity: H. M. Fowler

Tables for Transmutation of Orders of Merit into Units of Amount or
Scores: Kenneth E. Anderson, Robert T. Gray, Einar V. Kullstedt

An Empirical Investigation of the Problem of Disproportionate Frequencies
in Analysis of Covariance as Applied to a Methods Experiment:
Daisy Starkey Edwards and Sidney J. Parkin

Estimating Components of Variation in an Experimental Study of Learning:
William Harrison Lucow

A Procedure for Analyzing a Test and Maximizing its Reliability:
Angus G. MacLean and Arthur T. Tait

The Least-Squares Analysis of a p × q × r Factorial Design with Unequal
Subclass Frequencies

On the Problem of Sample Size for Multivariate Simple Random Sampling:
William J. Moonan

Note on Dispersion Analysis: Chester Harris

Constancy of Rorschach Color Responses Under Educational Conditioning:
Janet E. Blechner

"Psychological" Correction for Chance: Julian C. Stanley

June, 1954

Some Effects of Promotion and Nonpromotion Upon the Social and Personal
Adjustment of Children: John I. Goodlad

A Substrata Analysis of Spelling Ability for Elements of Auditory Images:
Jack A. Holmes

Annotated Bibliography of Publications Related to Teacher Evaluation:
William A. Watters

Intelligence Levels and Corresponding Interest Area Choices of Ninth Grade
Pupils in Thirteen Michigan Schools: Kent W. Leach

Outcomes of Lecture and Discussion Procedures in Three College Courses:
Harry Ruja


Journal of Experimental Education

Volume XXI    June, 1953    Number 4


AN INVESTIGATION OF THE RELATIONSHIP
BETWEEN TEACHING EFFECTIVENESS AND
THE TEACHER'S ATTITUDE
OF ACCEPTANCE


HAROLD J. REED* 
Long Beach, California 


I. Purpose 


THE PURPOSE of this study is to investi- 
gate the relationship between the teacher's atti- 
tude of acceptance and his teaching effectiveness. 
This investigation will examine the hypothesis 
that the teacher who is the more accepting of himself
and his environment is the more effective
teacher. 


II. Problems


The broad scope of this study can be defined 
in terms of the following problems: 

1. Is there a determinable relationship between
the teacher's effectiveness in the classroom
and that aspect of a teacher's personality
organization, or attitude, which permits him to be an
accepting person?

2. Can the predictive instrument employed in
the present investigation be made to provide
meaningful information concerning the teacher's
attitude of acceptance?

3. Can the criterion measures employed in
the present investigation be made to provide
reliable information concerning the students'
evaluations of their teachers' effectiveness and the
students' feelings concerning their teachers'
attitudes toward them?

4. Can the criterion measures employed in
the present investigation be made to show
meaningful relationships between the different
criterion groups?

5. Can it be shown that self-evaluations of
teaching effectiveness are reliable criterion measures?

6. Are there any meaningful relationships
between certain biographical data and the teacher's
attitude of acceptance?


III. Limitations of Study


The scope of this study can be defined further 
in terms of its limitations. 

1. This investigator has chosen to approach 
the complex problem of teacher effectiveness in 
terms of the personality dynamics of the teacher. 
All other factors are thereby excluded. 

2. It is not the purpose of this investigator to 
determine what type of personality is most effec- 
tive as a classroom teacher, nor to determine 
the structure of the optimum personality. Rather 
is it his purpose to analyze one aspect, or dimen- 
sion, of the optimum personality organization 
which has been found to be significant in other as- 
pects of the study of human nature, and to deter- 
mine its relationship to effective teacher behav- 
ior. 

3. This study will not attempt to compare
secondary teachers on this dimension of acceptance
with other teachers nor will it try to compare
teachers with other occupational groups.

4. The criterion measures used in this
investigation are designed primarily to sample the
feelings of the students toward their teachers.
It is not the purpose of this investigator to define
the effective teacher nor to establish valid
criterion measures of teacher effectiveness, beyond
what is involved in the criterion measures used
in this investigation.


IV. Need for Study 


It may seem presumptuous of this investigator
to think that something more can be added to the
many studies which have been made in the area
of teaching efficiency. He may be equally bold
in his latent criticism of the assumptions and
methodology of other experimenters. On the other
hand, the simple fact that as yet there is no
evidence of unanimity of opinion regarding the nature
of the effective teacher, and therefore no
acceptable technique for selecting the good teacher,
would seem to justify the assumption that a new
hypothesis or a new predictive measure might
add some knowledge or clarify some disputed
factor.


* 094 Orlena Avenue

In referring to correlation studies between
intelligence and effective teaching, Super concluded,
"Apparently the occupation 'teacher' is too broad
a category for psychological study."¹ He also
indicated that some occupational groups "were
not distinguishable from men-in-general"² in
their interest patterns. Teaching was considered
by Super to be one of those groups. This
attitude is shared by many investigators, but others
believe that the search for more refined
measures and descriptions of the effective teacher
should continue.

After reviewing 150 studies in the
measurement and prediction of teaching efficiency, Barr³
observed that the predictive devices used would
indicate that improvements could be made. He
called attention to the fact that in most studies
the reliability was high but the validity was
unknown. The low correlations of validity may have
been due, according to Barr, to a weakness in
the criterion or predictive measures. He also
felt that measures had been confused with
evaluations, and data had been consistent only when
repeated under comparable conditions.

The literature is replete with supplications
for better integrated teachers. Baxter has stated
that "the classroom must be considered a social
laboratory in which children learn to live with
others cooperatively."⁴ Snygg and Combs⁵ have
called attention to the role played by the teacher
in assisting students to discover realistic and
effective solutions to their present problems. The
complex living conditions of today seem to be
charging the American schools and the teachers
within those schools with continually broadening
responsibilities. These increased responsibilities
of the school seem to demand teachers of
peculiar powers and abilities.




The importance of the teacher's role in the
educational process has never been questioned.
If an administrator were to employ only those
candidates who possessed all of the traits of the
desired teacher, his school would be very
understaffed. At the same time the administrator is
in great need of predictive cues which will enable
him to select candidates who most closely approach
the ideal. Any knowledge which can assist him
will improve the school's contribution to the
student and society.

The selective process is a continuous one,
according to Ryans, "beginning as early as possible
in student life and continuing through teacher
training and on into the employment period."⁶ At each
succeeding level the refining process should be
more discriminating.

Not only is there a continuous process of selec- 
tion, but there is a need for a continuous or con- 
sistent criterion of judgment. Once the policy is 
accepted, school counselors, training institutions, 
and administrators are better able to perform 
their functions. 

Growth is also a continuous process. As our 
fund of knowledge of the teaching process in- 
creases through experience and experimental ev- 
idence, the findings should be shared with active 
teachers. 

Any evidence of the importance of the teacher's 
attitude toward himself, toward others, and to- 
ward situations, as it relates to his teaching ef- 
ficiency, should be investigated. Teachers often 
have deep-rooted habits that get in the way of 
achieving their goals. "For example, the habit
of judging pupil behavior in terms of its effects
on the accomplishment of teacher's own purposes
for the child, or for the group, interferes with
understanding the child; so does the habit of
judging pupil behavior on the basis of the teacher's
personal prejudices and cultural values."⁷

The great success of the volunteer study groups 
sponsored by the American Council on Education 
attests to the effect of in-service training pro- 
grams for increasing teacher effectiveness. It is 
hoped that the results of this study may, as others 
have done, provide some insight to teachers 
through study groups. 


1. D. A. Super, Appraising Vocational Fitness (New York: Harper and Brothers, 1949), p. 101.

2. Ibid., p. 383.

3. A. S. Barr, "The Measurement and Prediction of Teaching Efficiency," Journal of Experimental Education, (June 1948), p. 20.

4. B. Baxter, Teacher-Pupil Relationships (New York: Macmillan Co., 1942), p. 2.

5. D. Snygg and A. W. Combs, Individual Behavior (New York: Harper and Brothers, 1949), p. 242.

6. D. G. Ryans, "Appraising Teacher Personnel," Journal of Experimental Education, XVI (September 1947), 5: 1.


7. Division on Child Development and Teacher Personnel, Daniel A. Prescott, Helping Teachers Understand Children, prepared for Commission on Teacher Education (Washington, D.C.: American Council on Education, 1945), p. 21.


A theoretical framework has been necessary
for educational research. Educators are
responsible for evaluating the effects of various
educational procedures. Philosophy, and more
particularly the philosophy of education, has attempted
to make our experience intelligible through
critical thinking.⁸ Education has also looked to
psychology for a theoretical frame of reference as
well as for facts. However, psychologists,
according to Snygg and Combs, have not developed
"a frame of reference which brings their unwieldy
body of information into unity and consistency."⁹
Until some purposeful and meaningful order is
created from this atomistic approach, neither
education nor psychology can proceed toward
effective solutions to the problems of education.

This investigator will attempt to keep in mind
the needs of education in the area of teacher
effectiveness and at the same time utilize a
predictive device of psychology with a frame of
reference which seems to be consistent with the
philosophy of both education and psychology.

Most studies in this area have standardized a
predictive device against standardized criterion
measures, or they have attempted to correlate
standardized tests against a sample of active or
apprentice teachers. Few attempts have been
made to standardize teacher norms on these tests.
That, it would seem, would be a meaningful
contribution. In this study, the investigator will
attempt to create his own criterion measures and
establish his own norms on a predictive device
of a type which has been found to be effective in
related areas.

This investigator believes that, in many
studies, implicit errors result from the use of
criterion and predictive measures which are not
applicable to the purposes of the investigator, and
that these errors have contributed to rather than
resolved the confusion in the measurements of
teacher efficiency. For purposes of this study,
the investigator will sample the feelings and the
judgment of students, administrators, and
teachers on an unstructured scale, and thereby attempt
to free himself of the influence of norms which
may be inappropriate. The investigator will also
attempt to sample a dimension of personality
dynamics which has not previously been
measured, and to determine its relationship to
teacher effectiveness. It is hoped that a break with
conventional design may produce some new
understanding of this all-important area in the
educational field.


SECTION II
RATIONALE FOR STUDY


THE RATIONALE for this study is to be
found in the thinking of those psychologists and
educators committed to the opinion that behavior
is a function of a well-defined and consistent
attempt on the part of the organism to maintain a
unified and integrated personality organization.
As has been previously stated, it is not the
purpose of this investigator to add any new thinking
to the theory of personality. Rather is it his
purpose to submit some of the concepts of
previous researchers to further investigation for
purposes of clarification and verification. More
specifically, it is proposed to determine the
extent to which their hypotheses can be helpful in
teacher selection and training.

It has been assumed from a review of the
literature and experience in the fields of philosophy
of education and clinical psychology that there
is an optimum quality of personality that is more
effective than another, and that this quality of
personality is based upon the individual's goal of
maintaining some structure of values that is
meaningful to him and acceptable to his society.


The basic value concepts are the
meaningful core about which the personality
is organized. An integrated personality
will emerge if the core values are
harmonious and valid, while mental
conflict will occur if the core concepts are
inharmonious or incompatible.¹


These core values are not readily perceptible,
and for identification of them one must rely upon
their manifestations in behavior patterns, traits,
attitudes, feelings, etc. One trait that has been
examined carefully is acceptance. It has been
assumed that the individual who is accepting is
unthreatened; he feels secure; and if he is
unthreatened he will have no need to be aggressively
hostile or to defend himself. Dynamically, it
can be said that the unthreatened, secure, or
accepting personality is one that is well integrated
or harmoniously balanced.


8. J. T. Wahlquist, The Philosophy of American Education (New York: The Ronald Press, 1942), p. 5.

9. Snygg and Combs, op. cit., p. 205.

Section II

1. D. H. Prescott, Emotion and the Educative Process (Washington, D.C.: American Council on Education, 1938), p. 287.


Kurt Goldstein's² and W. B. Cannon's³
concepts of equilibrium and homeostasis describe
the organism's attempt to maintain itself in the
face of physiological disturbances. The same
phenomena can be noted in the organism's
affective nature. "Behavior expresses the effort to
maintain the integrity and unity of the
organization.... The nucleus of the system, around which
the rest of the system revolves, is the individual's
idea or conception of himself."⁴

The drive within the individual to maintain a 
balance, or his integrity, has been variously de- 
fined, but always in terms of the individual's 
needs and his perception of the world around 
him. Sherman and Sherman⁵ concluded from
their study of emotional responses in infants that
there were two opposite tendencies, "rejecting
the stimulus and accepting the stimulus." It
would seem, therefore, that the self is the locus 
of behavior. What the self does as he reacts to 
his environment is determined by the way he per- 
ceives his world, whether it is acceptable or 
whether it is threatening. 

From our knowledge of the phenomena of
perception, we can see that our perceptions do not
come simply from the objects around us, but
from our past experience.⁶ These perceptions
seem to be screened through our past experience
with those objects, the affective conditioning
resulting from that experience, and the future goals
we have in mind for ourselves. As Kelly points
out, we take a large number of clues, none of
which is reliable, add them together, and make
what we can of them. All that this gives us is an
estimate of our surroundings.⁷ The Hanover
Institute demonstrations in perception showed that
distortion of perception was significant, but they
also showed the disturbing effect that distortion
had on the one viewing it. Some became angry,
some laughed, and some were embarrassed. When
old habits fail to satisfy, the inconsistency
presents a problem for the individual. Snygg and
Combs have said,


Those individuals whose perceptions 
made possible the satisfaction of need 
are happy, effective and efficient people. 
On the other hand, those whose differen- 
tiations do not permit of adequate need 
satisfactions are likely to be ineffective, 
unhappy and generally thwarted
personalities.⁸


The differentiation of the perceptual world is
in terms of those things that are consistent with
the individual's idea or conception of himself,
according to Lecky,⁹ or the individual's
self-interests and value concepts, as stated by Prescott,¹⁰
or in terms of the phenomenal self, as described
by Snygg and Combs. The latter writers state
that the basic human need is "the preservation
and enhancement of the phenomenal self" and
"the phenomenal self includes all those parts of
the phenomenal field which the individual
experiences as part or characteristic of himself."¹¹
Raimy was one of the first to work in this area
and he defines the self-structure as the
self-concept which "is the more or less organized
perceptual object resulting from present and past
self-observation."¹²

The dynamic interrelations between the
situations of our phenomenal field and the desire of
the individual to maintain a state of balance in
accordance with his concept of himself, his
phenomenal self, give rise to emotional behavior.
"Attitudes and value concepts define for us the
areas of experience which will carry the
possibilities of arousing emotional responses."¹³ For
those who are well adjusted and emotionally
mature, it can be said that they are capable of
accepting into their organization any and all aspects
of reality. If too many elements in the
phenomenal field are unacceptable to the self, they will
be rejected. If test results are consistent with
the individual's already differentiated concept of
himself, there is likely to be little difficulty of
acceptance. When they deviate, there is a
problem. Levine and Murphy¹⁴ found that
pro-Communist sympathizers were not only able to
memorize pro-Communist materials more readily
than anti-Communist literature, but their recall
was better. The opposite was true of the
anti-Communist group.


2. K. Goldstein, Human Nature in the Light of Psychopathology (Cambridge: Harvard University Press, 1940).

3. W. B. Cannon, The Wisdom of the Body (New York: W. W. Norton & Co., Inc., 1939).

4. P. Lecky, Self-Consistency (New York: Island Press, 1945), p. 150.

5. M. and I. C. Sherman, "Sensory-Motor Responses in Infants," Journal of Comparative Psychology, V (1925), pp. 53-68.

6. E. C. Kelly, Education for What is Real (New York: Harper and Brothers, 1947), p. 34.

7. Ibid., p. 34.

8. D. Snygg and A. W. Combs, Individual Behavior (New York: Harper and Brothers, 1949), pp. 113-115.

9. Lecky, op. cit., p. 150.

10. Prescott, op. cit., p. 89.


[Diagram: concentric regions labeled, from the outside in, phenomenal environment, phenomenal self, and self-concept]


Snygg and Combs have illustrated in the above
diagram how the environment and the self-concept
are related to each other. "The closer to the
center of this figure an enhancing or threatening
differentiation occurs, the more vividly it will be
experienced."¹⁵ "The closer a deviant
perception lies to that portion of the phenomenal self
which we have called the self-concept, the more
difficult change is likely to be."¹⁶

The organization of the self-concept and its
functional aspects can be understood through a
study of attitudes. Attitudes have been shown to
play an important role. They partially define the
areas of emotionality, or those areas which the
individual finds difficult to incorporate into his
phenomenal self, and which hence cause
frustration and aggression.

Attitudes are formed through this constant
interacting process. Those that are acceptable,
or have value for the individual, are retained and
become habits; those that have negative value are
rejected. Allport lists four ways that attitudes
are formed:¹⁷ (1) through the accretion of
experience, or the integration of numerous
specific responses of a similar type; (2) by
individuation or differentiation; (3) through dramatic
experience or trauma; and (4) through the imitation
of parents, teachers, or playmates they are
sometimes adopted readymade.

Allport further defines attitudes¹⁸ as
well-defined objects of reference, (1) either material
or conceptual, (2) either specific or general, (3)
signifying an acceptance or rejection of the
object or concept of value to which they are related.
They lead one to approach or withdraw, to affirm
or to negate. To those who believe in the unitary
approach, there is but one basic attitude,
purpose, or motive, and that is a constant striving
for unity. The emotional states resulting from
a disruption of this unity cannot be treated
independently.¹⁹ Love is the emotion subjectively
experienced in reference to a person or object
already assimilated. Grief is experienced when
the personality must be reorganized due to the
loss of one of its supports. Hatred is an impulse
of rejection felt towards unassimilable objects.
Experiences which increase the sense of
psychological unity, or well-being, give rise to the
emotion of joy. Prescott has expressed the same
thought²⁰ that attitudes determine the meanings
of situations. Conditions menacing our
immediate safety arouse fear, so happenings which
jeopardize the attainment of security in the
future give rise to anxiety.

The extreme opposition to this unitary
approach has been recently stated by Thorndike.²¹
He would postulate a hierarchical organization of
selves. Traits, such as honesty, are not unitary
to Thorndike, but rather are collections of
independent features. He feels that even the factor
studies of Cattell and Guilford are unproductive.
Emotional states are due to the proclivities of
gene determiners, and personality can be
explained or modified through the action and
modification of the stock through eugenics. He warns
of too strict an interpretation of behavior by the
holists, connectionists, or purposivists.


11. Snygg and Combs, op. cit., p. 50

12. V. C. Raimy, "Self Reference in Counseling Interviews," Journal of Consulting Psychology, XII (1948), p. 153.

13. Prescott, op. cit., p. 89.

14. J. M. Levine and G. Murphy, "The Learning and Forgetting of Controversial Material," Journal of Abnormal and Social Psychology, LVIII (1945), pp. 507-517.

15. Snygg and Combs, op. cit., p. 129.

16. Snygg and Combs, op. cit., p. 157.

17. G. W. Allport, "Attitudes," Handbook of Social Psychology, Carl Murchison, Editor (Worcester, Mass.: Clark University Press, 1935), pp. [illegible].


Regardless of the manner by which one
attempts to explain behavior, there are certain
points of agreement. All would (1) agree that
behavior is causal, (2) postulate some concept of
optimum adjustment, (3) say that there are some
situations which have a positive effect upon the
individual and others which are negative, and (4)
agree that any change in the modus operandi is
extremely difficult.

Whether personality is approached from the 
reference point of explanation or modification, 
one concept seems to emerge as all-important, 
namely, acceptance. Adjustment can be
measured on a dimension of self-approval and
self-disapproval, acceptance or rejection. Raimy
postulated in his study, previously referred to, 
that the approval, disapproval, or ambivalence 
one feels for the self-concept, or some of its 
sub-systems, is related to his personal adjust- 
ment. 

This conceptual framework for the regarding
of personality has been well summarized by Carl
R. Rogers²² following his clinical experience
and an analysis of accumulating research evidence.
His theory of personality and behavior has been
stated in the form of nineteen propositions; "some
of these propositions must be regarded as
assumptions, while the majority may be regarded as
hypotheses subject to proof or disproof."

The studies reviewed in Chapter II (not
reproduced in this report; see original thesis on file
in Library, University of Southern California),
under Section 5, "Related Studies in the Field of
Psychotherapy," concluded that increasing
acceptance of self and the assumption of
responsibility for the self constituted positive changes in
the personality organization. Snygg and Combs
describe this sequence or development of insight
as follows:²³ (1) Individual perception of a
difference existing between the demands of the
situation and his phenomenal self. (2) Acceptance,
or the inclusion of a new concept into the
phenomenal self by means of a new differentiation of
self. They point out that changes may occur
gradually, traumatically, or in sheltered groups.

It is proposed in this dissertation to submit
this hypothesis of acceptance as a measure of
adjustment and effectiveness to experimental
proof. If the well-adjusted person is an
accepting person, it should follow that an effective
teacher is an accepting teacher. The teacher who
is unthreatened should be accepting and
acceptable. There should be a minimum of defensive
behavior on the part of the accepting teacher. It
is assumed that, if the teacher is accepting and
rejecting certain elements in his
phenomenological field, the student is doing likewise. If there
is a conflict between the needs of the two, there
will be problems. Changes must take place in
the behavior of one or the other, or both, if
harmonious adjustment is to be effected.

It is further assumed that, if a conflict exists,
one must resolve the problem before the other.
That one should be the teacher. It is not
impossible for the student to accomplish insight and
corrective action, but if our concept of the
teacher's role is valid, it is the teacher's
responsibility to provide the atmosphere in which the
student may adequately "evolve" and grow.

The unthreatened or accepting teacher is one
who can best accomplish the general function of
education as stated by Dewey. "Of these three
words, direction, control, and guidance, the
last best conveys the idea of assisting through
cooperation the natural tendencies of the
individuals guided."²⁴ Kirkpatrick²⁵ expressed the
same thought in his challenge to teachers to
allow the students to learn to think for themselves.
Democracy is lived, not learned.

Prescott, after several years of conducting
teacher study groups on understanding children,
concluded: "Whatever may be the root from which
develops an emotional acceptance of all
youngsters, we have found that this attitude
characterizes the teachers who are most effective in their
work."²⁶


18. G. W. Allport, Personality, A Psychological Interpretation (New York: Henry Holt and Co., 1937), p. 5l.

19. P. Lecky, op. cit., p. 152.

20. Prescott, op. cit., p. 190.

21. E. L. Thorndike, "The Organization of a Person," Journal of Abnormal and Social Psychology, XLV (1950), pp. 137-145.

22. C. R. Rogers, Client-Centered Therapy (Boston: Houghton Mifflin Co., 1951), Ch. 11.

23. Snygg and Combs, op. cit., p. 94.

24. John Dewey, Democracy and Education (New York: Macmillan Co., 1916), p. 28.

25. W. H. Kirkpatrick, "Democracy and Respect for Personality," Progressive Education, XVI (1939), pp. 83-90.


What has been found to be effective in therapy, a
learning process, may also be effective in the
learning process called education, according to
Rogers.27 And Snygg and Combs28 felt that the student
with a tremendous drive toward growth and
self-enhancement required only practicable and
socially acceptable opportunities for growth and
development.

The effect on the child of the interacting behavior
between the teacher and the child is obvious,
according to Wickman.29 "By counterattacking the
attacking types of problems and by indulging the
withdrawing types, the underlying difficulties of
adjustment in each case are increased and the
undesirable expressions of social behavior are
further entrenched." It would
therefore seem necessary, if teachers are to 
provide a good learning situation, that they un- 
derstand the student’s behavior and be able to 
accept it. They must appreciate that the student 
lives in a different perceptual world and that that 
world is amenable to change. Good behavior 
and good grades may be the goal of the elemen- 
tary child, but they may be a disgrace to the high 
school student. The teacher who can understand 
and accept the student’s concept of himself has 
already contributed much to the learner’s learn- 
ing by providing an accepting environment in 
which the learner feels worthwhile and in which 
he will therefore be more eager to assume the 
responsibility for his own learning. 


SECTION III
SOURCE OF DATA AND METHOD 


THE PURPOSE and need for this study, as contained in
Section I of this report and Chapter II (not included
in this report; see original thesis on file in
Library, University of Southern California),
indicated that some variation in design might
possibly produce some improvement in the
understanding of the effective teacher. The
literature and the experience of the investigator and
colleagues have offered a possible dimension of the
personality organization which, if present,
conceivably could be contributing to the
effectiveness of school teachers. Section II in this
report contained a description of an attitude of
acceptance as this dimension of personality which
this investigator has sought to examine for its
relationship to teacher effectiveness.

The data for this study were collected from 
the following sources: 

1. The criterion measures consisted of eval- 
uations of the teachers' effectiveness and relat- 
ed aspects of teacher effectiveness on three 
scales, Scales A, B, and C. Evaluations of 160 
teachers were made by the students and admin- 
istrators at three secondary schools. Of the 
160 evaluated teachers, 104 volunteered to par- 
ticipate in the predictive phase of this study. 
These teachers were asked to evaluate themselves
on the same three scales used by the administrators
and students. It was therefore possible
to compare the difference between student
and administrator ratings of those teachers who 
volunteered to participate in this study and those 
who did not. The 104 teachers who did volunteer 
will hereafter be called participating teachers 
and the 56 who did not will be referred to as non- 
participating teachers. 

2. The predictive measure of teacher effec- 
tiveness consisted of a sentence completion test 
from which a quantitative measure was obtained 
of the teachers' attitude of acceptance. One hundred
and four teachers who participated in the
standardization and validation program completed
the sentence completion test divided into two
parts on the basis of scoring principles. 

This Section will present a summary of the source of
data used in this investigation. Section IV will
present a detailed description and justification for
the criterion measures used. Section V will offer a
rationale and use for the sentence completion
technique. Section VI will present the scoring
principles used in this investigation and the method
of establishing reliability for the test. Section VII
will summarize the findings from the criterion
measures, and Section VIII will present the
correlations between the criterion measures and the
predictive measures.


I. Schools Participating in the Study


The secondary schools selected for this study were
chosen primarily on the basis of availability to the
investigator. They represent, however, a fair cross
section of the population and socio-economic status
of a large metropolitan area including both urban and
rural communities.

Two of the three schools are city school districts
and the third a union district. Two of them are
primarily agricultural communities including some
industry, and the third is an urban residential area
contiguous to the city of Los Angeles. All three are
large departmentalized schools of over one thousand
students with 46, 83, and 103 classroom teachers.

26. Division on Child Development and Teacher Personnel, D. H. Prescott, Helping Teachers Understand Children (Washington, D.C.: American Council on Education, 1945).
27. Rogers, op. cit., p. 384.
28. Snygg and Combs, op. cit., p. 238.
29. E. K. Wickman, Children's Behavior and Teachers' Attitudes (New York: The Commonwealth Fund, 1928), p. 171.


II. Criterion Data 


Criterion measures for this study consisted 
of student, administrator, and self evaluations 
of classroom teachers on three scales. Table 
I indicates that 160 teachers were evaluated, in- 
cluding 93 male teachers and 67 female teachers. 
Of the 160 evaluated teachers, 104 participated 
in the standardization and validation of the pre- 
dictive measure. 

In Schools I and II, all of the teachers were
evaluated, while in School III only the participating
teachers were rated. A total of 10,115 student
evaluations were obtained on the 160 teachers, with a
mean number of 79.8 for School I, 60.16 for School II,
and 47.2 for School III.
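
As an arithmetic check on these figures, the means can be recomputed from the totals reported in Table I. The following is only a sketch in Python; the dictionary layout and the function name mean_evaluations are illustrative assumptions, and only the counts themselves come from the table.

```python
# Totals from Table I: (total teacher evaluations, teachers evaluated) per school.
table_i = {
    "School I": (3671, 46),
    "School II": (4933, 82),
    "School III": (1511, 32),
}


def mean_evaluations(total_evaluations: int, teachers: int) -> float:
    """Mean number of student evaluations received per evaluated teacher."""
    if teachers <= 0:
        raise ValueError("teachers must be positive")
    return total_evaluations / teachers


# Rounded to two decimals so the values can be compared with the reported means.
means = {
    school: round(mean_evaluations(total, n), 2)
    for school, (total, n) in table_i.items()
}

# Sum of the three school totals, for comparison with the reported 10,115.
grand_total = sum(total for total, _ in table_i.values())

for school, value in means.items():
    print(school, value)
print("All schools:", grand_total)
```

The recomputed means agree with the reported 79.8, 60.16, and 47.2 when rounded to the same precision, and the three school totals sum to the reported 10,115.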

It will be noted from the table that the number of
10th grade students of School III is out of
proportion both to the number of 11th and 12th grade
students and as compared with Schools I and II. This
can be accounted for by the fact that the volunteer
participating teachers in that school had more 10th
grade than 11th and 12th grade classes. At Schools I
and II, the number of classes visited was equally
distributed among the three grades.

The investigator and two assistants selected a
sufficient number of classes of required courses in
Schools I and II to obtain a sampling of all
teachers. In order to disrupt the classes as little
as possible, the students in each class evaluated
each of their several teachers at one time. All
evaluations for one school were obtained in one day.
The average length of time for administration was
fifteen minutes for all three scales.


III. Evaluation Scales


Several problems presented themselves in gathering
the criterion data. (1) Some evaluation of the
teacher's efficiency was desirable, as well as an
indication of the students' attitudes and feelings
toward their teachers' behavior. Therefore, at least
two scales were essential, one for effectiveness and
one or more to register the students' feelings on
other factors.

It was felt that current rating scales were too 
structured for purposes of this study. An
unstructured, or non-itemized, scale was prepared
for the effectiveness evaluation. (See Appendix 
A of original thesis on file in Library, University 
of Southern California.) For Scales B and C, the 
situation was structured or itemized to the degree 
that the two extremes were defined for the evalu- 
ator. 

(2) The reliability of rating scales is depend- 
ent upon the clarity and consistency of the instruc- 
tions. Only the investigator and his two assist- 
ants administered the evaluations in 144 class- 
rooms. The instructions to the students were 
standardized and presented uniformly. (See
Appendix B of original thesis on file in Library of
the University of Southern California.)1 The
students had no questions on Scale A. There were
very few on Scale B. The students seemed to 
understand the difference between hypothetical 
Teachers X and Y. However, there was some 
question regarding the extent to which a certain 
teacher had the characteristics of X or Y. Some 
students felt that a certain teacher had some of 
each. In that case, the examiner had to repeat 
the instructions that the check mark was to be 
placed nearer X than Y if the student felt that the 
teacher had more of the X traits than Y traits, 
or if the teacher had all of the X traits in excess 
of the Y, but he had them to a lesser degree than 
the ideal established in the definition. After com- 
pleting the B Scale, the students had no questions 
on the C Scale, due to its similarity to the B 
Scale. 

(3) A third problem was the order of presentation of
the three scales. It was desired to reduce as much as
possible the influence of a response set, which is
normally present whenever one evaluates another
person on more than one item or scale.2 It was
decided that the general teaching effectiveness
scale, Scale A, if presented first, would reduce the
influence of a response set to a minimum. In the
first place, Scale A was unstructured or
non-itemized. The evaluator could set his own limits
and use his own standards. If Scales B and C were
completed first, some carry-over from those
structured scales would occur. In the second place,
it was felt that all evaluators would be more
familiar with the nature of Scale A than of Scales B
and C, and they would therefore be able to complete
it more easily.

Section III-----

1. The only variation in the instructions was at School III, where only the participating teachers were evaluated.
2. L. J. Cronbach, "Response Sets and Test Validity," Educational and Psychological Measurement, VI (1946), pp. 475-494.


TABLE I

SOURCE OF CRITERION DATA

                                         Schools
Evaluations                            I      II     III   Total
Teachers Evaluated                    46      82      32     160
  Male                                26      51      16      93
  Female                              20      31      16      67
Teachers Participating                21      51      32     104
  Male                                13      31      16      60
  Female                               8      20      16      44
Percent of Evaluated Teachers
  Participating                     45.7    62.2     100
  Male                               50.     61.     50.
  Female                             40.    64.5     50.
Administrator Evaluations              3       2       1
Self Evaluations                      15      30      28      73
Student Evaluations
  10th Grade Male                     97     132     438     667
  10th Grade Female                   88     172     479     739
  11th Grade Male                    155     149     169     473
  11th Grade Female                  156     182     157     495
  12th Grade Male                    115     116     138     369
  12th Grade Female                   93     123     130     346
  Total                              704     874    1511    3089
Total Teacher Evaluations           3671    4933    1511   10115
Mean Evaluations per Teacher        79.8   60.16    47.2

(4) A fourth problem involved the tendency or 
set to evaluate each teacher as above average. 
It was thought that, if each evaluator first ranked 
his several teachers, the set to rate each teach- 
er about the same could be broken. Such a pro- 
cedure would force greater variation and more 
discrimination. However, an equally difficult 
problem would be likely to appear, namely, the 
lack of freedom to place the teacher where the 
evaluator wished. Forcing the student to rank 
his teachers might interfere with giving him the 
opportunity to express his unstructured feelings; 
he would be forced into a position of deciding 
which of two or more teachers was the more ef- 
fective, and the resulting frustration could cause 
him to record his evaluation in a position other 
than that which he desired. 

An attempt was made to break any set toward 
ranking all teachers above average by reversing 
the position of the optimum. Theoretically, 
"Teacher X" on Scales B and C represented the
same level as "superior" on Scale A. At the
same time, it was not felt that this would inter- 
fere with the evaluator’s judgment of the teacher. 

(5) A fifth problem was closely related to the 
fourth. A decision was made to construct the 
scales with no intervals. Again it was thought 
desirable to allow the evaluator complete free- 
dom in his recording and therefore introduce as 
few distractions as possible from the primary 
function, namely, the effectiveness of the teach- 
er as felt by the student or administrator. The 
scoring of the evaluations by the investigator was 
accomplished by a nine-point scale placed on the 
line, and the interval value recorded. 
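
The scoring step just described can be sketched as follows. This is a minimal illustration, not the investigator's actual procedure: it assumes the nine intervals were of equal width, that the position of the check mark was measured from the left end of the line, and that the optimum could lie at either end of the line. The function name score_mark and its parameters are hypothetical.

```python
def score_mark(position: float, line_length: float,
               points: int = 9, optimum_at_left: bool = False) -> int:
    """Convert a check mark on an unmarked line to an interval value.

    position        -- distance of the mark from the left end of the line
    line_length     -- total length of the line, in the same units
    points          -- number of equal intervals laid over the line (assumed 9)
    optimum_at_left -- True when the favorable end of the scale is on the left
    """
    if line_length <= 0:
        raise ValueError("line_length must be positive")
    if not 0 <= position <= line_length:
        raise ValueError("position must fall on the line")

    # Index of the interval containing the mark; a mark at the far right
    # end is assigned to the last interval.
    index = min(int(position / line_length * points), points - 1)
    score = index + 1

    # Flip the value when the optimum sits at the left end, so that a
    # high score always means a favorable evaluation.
    return points + 1 - score if optimum_at_left else score


if __name__ == "__main__":
    # A mark halfway along a line 10 units long falls in the fifth interval.
    print(score_mark(5.0, 10.0))
    # The same mark near the left end, scored with the optimum on the left.
    print(score_mark(0.5, 10.0, optimum_at_left=True))
```

Handling the direction of the scale in the scoring function, rather than on the form itself, is one way to keep interval values comparable across scales whose favorable ends are reversed.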

The Scales used by the administrators were 
identical to those used by the students. The in- 
structions for Scale A were also identical. How- 
ever, for Scales B and C the administrator was 
asked to evaluate the teachers on the basis of 
what he felt would be the students' feelings about
the teachers' attitudes and methods of going about
their teaching.


IV. Data on Predictive Measures 


The data on the predictive measures were ob- 
tained from two sources: (1) scores on a sentence 
completion test, and (2) self-evaluations by the 
participating teachers on the three scales used 
by the students and administrators. A total of 
104 teachers participated in the standardization 
and validation of the sentence completion test, 
the primary source of data. 

Of the 104 participating teachers, 60 or 58 percent
were males and 44 were females. Seventy-two or 69
percent were married and 31 unmarried. (One declined
to state.) The average age of the group was 38.5 and
the average number of years of teaching experience
was 13.6. Sixty, or 58 percent, of the teachers were
academic instructors.

An original test was compiled consisting of 91 items
or stimulus phrases. A majority of the items were
taken from other published tests; the rest consisted
of several items constructed by the investigator to
sample attitudes toward structured school situations.

It was proposed to score the item responses 
on a uni-dimensional value of acceptance. Stand- 
ardization of the scoring system was accomplish- 
ed by correlating the test responses of the teach- 
ers at School I against the evaluations of the stud- 
ents on Scale A, teacher effectiveness. 

After 64 items were eliminated by observa- 
tion and difficulty of scoring, the tests of the 83 
teachers in Schools II and III were scored "blind"
on the 27 items remaining. These total scores 
were validated against the student and adminis- 
trator evaluations at Schools II and III. 

Of the 64 items rejected, it was found that 13 
of them, structured to elicit responses to various 
drives, could be scored on another dimension. 
The investigator then standardized a scoring sys- 
tem for the 13 items. Again, the validation of 
these items was accomplished by correlating the 
scores against the student and administrator
evaluations of the participating teachers in
Schools II and III.

The chi-square technique was applied to the 20
teachers rated highest by the students on Scale A,
and to the bottom 20. Fourteen of the 27 items in
group 1, and 12 of the 13 in group 2, met a
significant or near significant level. These 26 items
were retained and validated against the student and
administrator evaluations of Schools II and III.
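
The item-selection step can be illustrated with a chi-square computation on a 2 x 2 table comparing the high and low groups on a single item. This is a sketch only. The counts are invented for illustration and are not data from this study, and the uncorrected Pearson formula and the .05 critical value of 3.841 for one degree of freedom are assumptions, since the report does not state which form of the test or which level was applied.

```python
def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Uncorrected Pearson chi-square for a 2 x 2 table.

                     accepting   not accepting
    high group           a             b
    low group            c             d
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be positive")
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


# Critical value of chi-square with 1 degree of freedom at the .05 level.
CRITICAL_05_DF1 = 3.841

# Hypothetical item: 15 of the 20 highest-rated teachers and 6 of the
# 20 lowest-rated teachers gave a response scored as accepting.
statistic = chi_square_2x2(15, 5, 6, 14)
retain_item = statistic >= CRITICAL_05_DF1

print(round(statistic, 2), retain_item)
```

With groups of only 20, expected cell frequencies can be small, so a continuity correction or an exact test might be preferred in practice; the uncorrected form is shown here only because it is the simplest.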

A secondary source of predictive data con- 
sisted of the teachers’ self-evaluations and cer- 
tain biographical data such as sex, marital sta- 
tus, number of dependents, subjects taught, years 
of age, and years of teaching experience. 


V. Sequence of Activities 


1. Investigator constructed criterion meas- 
ures: Evaluation Scales A, B, and C. 

2. The 91 items were selected for the predictive
measure, the Sentence Completion Test.

3. The investigator solicited the cooperation of
secondary schools in this study.

4. The student and administrator evaluations of all
classroom teachers were obtained at School I during
one day. The investigator administered the predictive
measure to those teachers who volunteered to
participate in the standardization and validation of
the predictive measure. These teachers also filled
out the evaluation scales on themselves and completed
a personal history blank.

5. The same procedures as in Step 4 were fol- 
lowed at School II. 

6. At School III, the procedure was altered. 
The investigator appeared before the faculty of 
the school and explained the project and asked for 
volunteers. Those teachers who volunteered
constituted the participating teachers, and they were
the only ones evaluated by the students. It was 
hoped that this modification at School III might 
add some significant experience to the problem 
of obtaining adequate criterion measures. The 
predictive measures were obtained in the same 
manner as those in Schools I and II. 

7. While the data were being collected at 
Schools II and III, the investigator proceeded with 
the standardization of Parts I and II of the sen- 
tence completion test. 

8. Chi-square technique was used to refine 
Parts I and II of the sentence completion test. 

9. The sentence completion test scores for 
the teachers at Schools II and III were scored 
“blind” and were validated against the student, 
administrator, and self-evaluations. 


SECTION IV 
THE CRITERION MEASURES 


I. Criterion Measures Used in Study 


THIS SECTION will present the purpose of 
the criterion measures used in this investigation 
and the justification for the use of the raters used 
to evaluate the subjects on the scales used as cri- 
terion measures. 

The following three scales of teacher effectiveness
as used in this study will be described briefly:


1. Scale A. A non-itemized and unstructured
scale of teacher effectiveness.

2. Scale B. A structured scale to elicit the
evaluators' judgment of the teachers' attitude
toward the students.

3. Scale C. A structured scale to measure the
rater's judgment of the ease with which the
teacher goes about his teaching.


This section will also attempt to justify the use of
the following three classes of raters on the three
scales described above.


1. A representative sample of student judg- 
ment in each of three secondary schools. 

2. Administrator judgment. 

3. The judgment of the participating teachers 
themselves at the three schools. 


This investigator is assuming that the person 
best qualified to evaluate the teachers’ effective- 
ness is the one nearest the teachers, namely, 
the student. The other person most concerned 
with the teacher is the administrator. A third 
person who in one way is most concerned with 
the teacher is the teacher himself. It was proposed,
therefore, to secure evaluations from all three
sources.


II. The Use of Student Evaluations


Many investigators have criticized the practice of
using student judgment as a criterion measure because
students are presumed to be immature and irrational.
One instructor1 asked his ninth-grade students to
evaluate him two semesters later on what they liked
and disliked about him. He was unaware of the first
two traits they disliked about him, namely,
domineering and no sense of humor. He indicated that
this was a "blow." However, it was a "great
satisfaction to know they had really learned
something," the factor they liked most about him. The
implication would seem to be that the teacher could
not accept those traits the students disliked about
him. If they were true, the fact that the students
really learned served as the "great reward which
lends justification to his existence as a teacher."
This article also introduced some information on our
second question concerning the basis for judgment. Is
one to accept the teacher's ability to "teach us
something" or should one consider the other traits of
domineering, and no sense of humor, as the
significant factors?

Hart2 concluded from his survey of 10,000 high school
students that students were mature enough to think
straight on the question of teachers and teaching,
and that they could weigh values and arrive at
reliable and significant conclusions. Cook and Leeds3
used fourth to sixth grade pupil ratings under the
assumption that "they were smart enough to evaluate
and not as sophisticated as high school students." In
their validation study they found that the pupil
ratings correlated higher with inventory scores than
the administrators', but not as high as the experts'
ratings.

Section IV-----

1. S. Callahan, "Is Teacher Rating by Students a Sound Practice?" School and Society, LXIX (1949), p. 98.
2. F. W. Hart, Teachers and Teaching (New York: The Macmillan Co., 1931), p. 283.
3. W. W. Cook and C. H. Leeds, "Measuring the Teaching Personality," Educational and Psychological Measurement, VII (Autumn, 1947), pp. 399-410.

While the educators' philosophy and attitudes 
toward students relative to their right to make 
their own decisions have often been divided, there 
has been comparative agreement among psychol- 
ogists, curriculum makers, and methods workers 
that the students' interests should be considered 
first, and that teaching should be at the students’ 
level. However, the privilege of evaluating the 
teachers’ work has been largely reserved by the 
adult. Only recently has student judgment been 
considered a reliable criterion measure. There
is often an element of threat to the authority fig- 
ure in the matter of student evaluation which is 
seldom explored. A consideration of this prob- 
lem usually revolves instead around the maturity 
of the student and whether his judgment can be 
trusted. It would seem to be consistent with our 
growing knowledge and appreciation of the indi- 
vidual’s ability at all ages to evaluate his own 
environment in terms of what stimulates him 
positively and negatively that the individual should 
also share in the planning and execution of the ed- 
ucational process. The executive authority in 
any social institution has been delegated to a few
for efficient operation but in research, at least, 
consideration might well be given to the student’s 
values. 


III. The Use of Administrator Evaluations


The administrator has long been the one to 
evaluate the work of his subordinates. It is not 
proposed in this study to eliminate his function. 
He is responsible for the operation of the school. 
One of his duties is the selection of teachers. If 
the students are not being taught, the teacher is 
not teaching, and the administrator is therefore 
indirectly concerned and directly responsible. 

There seems to be no disagreement concern- 
ing the administrator's function in the evaluation 
process. There is some disagreement regarding 
the reliability of his judgment. If the investiga- 
tor is to use evaluative judgment as a criterion 
measure, he must consider the administrator. 
Some investigators have sought to use pupil growth 
as measured by objective tests and other data for 
their criterion measures, but all they have done 
is to remove direct subjective evaluation. Indirect- 
ly, the administrator's judgment has already been 
considered in the standardization of the objective 
data. 




IV. The Use of Self Evaluations 


There is a growing body of knowledge and ex- 
perience which would indicate that the individual 
himself is capable of making evaluative judgments 
of himself. It has long been customary to re- 
view the ratings with the one rated. The purpose 
has been to improve the individual's performance 
and to let him know how he is doing in relation to 
his colleagues and peers. This has been a frus- 
trating experience for both the rater and the one 
rated for two reasons. In the first place, the two 
seldom see ‘‘eye to eye’’ for objective reasons. 
There is the implication that one of the two is less
right than the other. In the second place, the sub- 
jective defenses of both parties must be consid- 
ered. 

Few would deny that much can be gained by 
soliciting the assistance of the subject in any
activity. This factor has been overlooked in most
evaluation projects. On the other hand, it is well
established clinical practice to use the evaluation 
of the subject of himself. More use of self eval- 
uations is necessary to establish the reliability 
of self evaluations for criterion measures. 


V. The Use of Itemized and Non-Itemized Scales 


Predictive measures must be standardized and 
validated against active teachers. With these data 
of discriminating factors, selection can be
performed more efficiently. In order to standardize
the predictive measures, the criterion measures
must be carefully defined or one must accept
the evaluator's judgment for whatever reason or
reasons are meaningful to him. 

Suchman4 has pointed out the fact that both
itemized, or defined, and non-itemized scales
have been used in social science research. In 
the non-itemized approach, no attempt is made 
to produce a definition of the variable. 

In the selection and definition of itemized
aggregates of attributes, the number of
characterizing items that exist for any single
variable is unlimited. And, as Suchman points out,
there is little inherent reason why any one item is
better than another. The final decision of whether an
item characterizes a universe must be a subjective
one. Suchman5 challenges the research worker to be
scientific by translating loose descriptive
terminology into more precise classificatory systems.
Most research efforts in the area of teaching
effectiveness have attempted to define the meaning of
some attribute or variable in such a way as to permit
the classification of persons according to the degree
to which that attribute is absent or present. This
has been the problem of scale construction. The
selection and definition of these "meaningful
variables" has been the problem which has contributed
to much of the confusion. The result has been
conclusions such as Baxter's6 that there is no one
pattern-personality of exact or particularized
characteristics or any single configuration of
personal attributes which characterizes all effective
teachers.7

4. E. A. Suchman, "The Logic of Scale Construction," Educational and Psychological Measurement, X (1950), p. 82.
5. Ibid., p. 79.

Even though Suchman rules out the procedure of
selecting items on the basis of some correlational
test,8 he states that there must be an adequate
content interpretation for both acceptance
and rejection of an item. This would seem to in- 
dicate that any hypothesis for the selection of an 
item is acceptable providing it is submitted to ex- 
perimental evidence. 


VI. The Use of Scale A 


Scale A as a criterion measure of teacher
effectiveness is considered to be non-itemized in
that no traits or attributes are suggested to the
rater. The investigator has deliberately withheld any
suggestion of structure. It may be considered a
global approach rather than an atomistic one in that
it purports to establish effectiveness by sampling a
universe of unstructured subjective opinion and
feelings. The predictive measure may be carefully
defined; and, if the predictive measure used does
correlate with the global criterion, it may be
concluded that some relationship exists without
saying that it is causative. Some knowledge has
therefore been gained. Continued refinement of the
definition and repeated sampling of the population
may further add to our understanding of the effective
teacher.


VII. The Use of Scales B and C 


Additional evidence of the effectiveness of non- 
itemized versus itemized scales may be gained 
from a study of the relationship between evalua- 
tions registered on both types of scales by the 
same population. It was partly for this reason
that Scales B and C were used in this study. 


6. B. Baxter, Teacher-Pupil Relationships (New York: Macmillan Co., 1942), p. 10.
7. Ibid., p. 94.
8. Suchman, op. cit., p. 84.
9. Baxter, op. cit., p. 36.
10. Hart, op. cit., p. 136.
11. C. R. Rogers, Client-Centered Therapy (Boston: Houghton Mifflin Co., 1951), p. 520.
12. Baxter, op. cit., pp. 73, 74.
13. Hart, op. cit., p. 134.

It is hoped that Scales B and C will also add
something to our definitions of effective teachers as
proposed by Suchman. Baxter9 pointed out as a result
of her study that one cannot separate the teacher's
personality from his skill as an instructor, as is
suggested by many rating scales. She also observed
that the effective teacher could identify himself
with the learner because he was ready and willing to
forget self and to rejoice with the learner in his
satisfaction at discovering for himself.

Hart10 found that Teacher A, the most liked one, was
human, friendly, companionable and "one of us."
Teacher A was also interested in the pupils, and
understanding. Those were the third and fourth
reasons, respectively, given by the students.

Rogers' 18th proposition11 in his theory of
personality and behavior reflects this concept:


     When the individual perceives and accepts
     into one consistent and integrated system
     all his sensory and visceral experiences,
     then he is necessarily more understanding
     of others and is more accepting of others
     as separate individuals.


These observations and the investigator's experience
contributed the framework for Scale B. It was desired
on this variable to have the evaluator's expression
of what he felt was the attitude of the teacher
toward him, in the case of the student, and what the
administrator felt was the attitude of the teacher
toward his students. Teacher X on Scale B
theoretically represented the accepting,
understanding, and companionable teacher.

Scale C attempted to incorporate the findings of
Baxter12 that the good teacher was poised and able to
face conflicting demands without becoming hurried or
petulant. The good teacher did not seem to be
actuated by the necessity of having his pupils
accomplish a given amount of work within the shortest
time, but was leisurely and relaxed in his guidance.

Hart13 found that the second reason given by high
school students for liking Teacher A was
cheerfulness, happiness, and a good-natured
disposition with a sense of humor.

The investigator's experience in his observa- 
tion of teachers and counseling with teachers has 
led him to feel that the poor teacher takes his 
work too seriously and seems to be trying too 
hard. It would seem that some degree of these 
traits is necessary for effective teaching, but 
that there is a point of diminishing returns be- 
yond which they are detrimental. This type of 
teacher also seems to take too much responsi- 
bility for the guidance of his students. It seems 
to be very difficult for him to allow the student 
the privilege of learning for the sake of learning. 
It is also difficult for the poor teacher, or at least 
the teacher who is having trouble with his teach- 
ing, to realize that growth is a functional, active 
process rather than a passive one of being told 
and shown. 


SECTION V 


THE PREDICTIVE MEASURE 


I. Summary of Rationale 


IT WAS indicated in the preceding section 
that the well-adjusted and effective individual is 
assumed to be an accepting individual. The ac- 
cepting individual has been shown to be one who 
perceives his environment to be unthreatening to 
his concept of himself. The individual whose be- 
havior indicates that he is well adjusted and in- 
tegrated seems to be the one who perceives the 
least number of inconsistent or inharmonious 
elements in his “phenomenal field.” Those elements 
which are perceived to be unacceptable to 
the self will be rejected; those elements which 
are consistent with or which fit into the individ- 
ual’s concept of himself will be accepted. 

It is not the purpose of this study to determine 
the teacher's “phenomenal self” or his self-concept, 
at least not for diagnostic or therapeutic 
purposes. The investigator has accepted the di- 
mension of acceptance as a tenable hypothesis 
and he will attempt to submit it to experimental 
analysis in a specific situation, namely, teacher 
effectiveness. 

It is assumed that a projective test can ade- 
quately determine the quantitative degree of this 
acceptance-rejection dimension possessed by the 
teacher. It is also assumed that if this trait can 


(Vol. XXI 


be projected to a measurable degree, it is detected 
or “felt” by others. It is assumed that Scale 
A will measure the evaluator's reaction to the 
teacher's effectiveness. If some relationship is 
found to exist between the criterion measures and 
the projected trait on the predictive measure, it 
may be assumed that some relationship exists. 

It is further assumed that Scales B and C will 
measure some of the projected trait as detected 
or observed by the evaluator. 

The investigator has further delimited his study 
by not attempting to compare the degree of 
acceptance in teachers with that in other 
vocational groups. Additional studies may be 
conducted to determine any significant differences 
with respect to this factor. 


II. Available Devices 


There is a certain hierarchy of tests or devices 
to be used for screening purposes. During 
the war, attempts were made to screen those in- 
dividuals most likely to break under stress; Zubin, 
in his report on the investigation, 1 pointed out 
the difficulty of predicting stress tolerance from 
miniature experimental stress situations, and 
stated that group screening techniques have be- 
come a necessity. 

At a lower level than stress situation in war 
is the process of screening the maladjusted from 
the adjusted in a clinical situation. Beyond that 
level is the screening of job applicants. Zubin 2 
indicated that critical scores or items may be 
significant in one situation but not in others, and 
the problem of verifiable data becomes more 
complicated at the lower levels of intensity. This 
would mean that discriminating differences among 
“normals,” or applicants, would be more elusive. 
At the same time the selection process, as of 
teachers, is an important element in the efficient 
operation and administration of the educational 
program. 

The consensus of writers in the field seems 
to be that rating scales and objective paper and 
pencil personality tests are inappropriate for 
measuring the teacher personality. Some of this 
inappropriateness is due to a narrow definition 
of criterion measures, according to Baxter. 3 
Structured personality tests of the usual paper 
and pencil type do not offer access to the 
personality make-up, or its processes, according to 
Hutt. 4 


1. J. Zubin, "Recent Advances in Screening the Emotionally Maladjusted," Journal of Clinical Psychology, IV (1948), p. 57. 

2. Zubin, op. cit., p. 59. 

3. B. Baxter, Teacher-Pupil Relationships (New York: Macmillan Co., 1942), p. 153. 

4. M. L. Hutt, "The Use of Projective Methods of Personality Measurement in Army Medical Installations," Journal of Clinical Psychology, I (April 1945), p. 135. 


June, 1953) 


Rohde 5 pointed out that interviews, questionnaires, 
and inventories have certain limitations, 
because of their direct questioning technique, 
which tends to make the individual self-conscious 
and defensive and usually prevents him from dis- 
closing his deeper self. 

The personality inventory, according to Zubin, 6 
in its present form was devised to differentiate 
between normal and deviant groups and not for 
differentiating within the deviant group. He goes 
on to say that for this purpose a new group of 
tests is needed, perhaps of the word association 
and other projective types. It might be added 
that these same tests are also necessary for 
differentiating within the normal group, as in 
personnel selection. 


III. The Use of Projective Techniques 


For purposes of this study, the investigator 
felt that the projective test could best measure 
the personality dimension to be submitted to 
experimental analysis. He considered the 
individual’s need for a unitary, consistent, and 
unthreatened personality organization to be a 
dynamic mechanism designed to establish a state of 
equilibrium. A survey of available testing 
techniques seems to reveal that the projective 
method is better able than the conventional 
personality inventory to sample the personality 
organization. 


IV. Characteristics of Projective Techniques 


Freud was the first to use the term “projection.” 
However, as currently used in projective tests, 
projection means more than a defensive function, 
according to Bell; 7 it is also an expressive 
function. Bell 8 gives the Latin derivation as “to 
cast forward,” which is the action involved in the 
technique. He points out 9 that the purpose of 
projective techniques is to gain insight into the 
individual's behavior. This also is the purpose of 
other personality tests; however, projective tests 
are global in their approach, in contrast to the 
atomistic approach, which centers its attention 
upon traits of the personality considered as 
disparate items. As Harriman states: 


The purpose of these procedures is to obtain 
an insight into values, wishes, repressions, 
emotional organization, and so on, which the 
individual might be unwilling or unable to 
supply if the direct-question method were 
used. 10 


There are three characteristics of projective 
techniques, according to Bell: 11 


1. Presentation of a stimulus to the subject 
which does not make manifest, or partially makes 
manifest, the real purpose of the examiner. 

2. Sampling individual behavior in a structured 
event of sufficient brevity to be clinically 
practicable and of sufficient stimulation to call 
forth a wide range of individual responses. 

3. Consideration of the recorded behavior, as 
well as the personality that produces it, as an 
organized totality. 


The purpose of projective tests, according to 
Korner, 12 is not to predict reality behavior; tests 
merely reflect secondary configurational patterns. 
He cautions test users to realize that tests merely 
record behavior, and all behavior manifestations 
are expressive of an individual's personality. The 
scoring of a test is merely the examiner’s 
“shorthand” used to reduce behavior to manageable 
proportions; and his clinical insight or judgment 
is only by inference based upon familiarity with 
behavior dynamics. 13 

Bray, 14 on the other hand, has stated that 
testers have seldom even attempted to predict 
behavior from their test results. The apparent 
conflict between Korner's and Bray's positions 
seems to revolve around the use of tests as 
predictors of behavior or as diagnostic devices from 
which one may infer certain dynamic tendencies 


5. A. R. Rohde, "Explorations in Personality by the Sentence Completion Method," Journal of Applied Psychology, XXX (April 1946), p. 169. 

6. Zubin, op. cit., p. 59. 

7. J. E. Bell, Projective Techniques (New York: Longmans, Green and Co., 1948), p. 2. 

8. Ibid., p. 3. 

9. Ibid., p. 4. 

10. P. L. Harriman, The New Dictionary of Psychology (New York: Philosophical Library, 1947), p. 270. 

11. Bell, op. cit., pp. l-6. 

12. A. F. Korner, "Theoretical Considerations Concerning the Scope and Limitations of Projective Techniques," Journal of Abnormal and Social Psychology, XLV (1950), p. 623. 


REED 291 


through clinical judgment. 

It would seem that different stages in the 
process of test construction are the occasion for 
this disagreement. The second step after the 
criterion selection in the development of prediction 
instruments, according to Horst, 15 is the 
assembling of data on a group representative of the 
population for which predictions are to be made. The 
nature of these data will be controlled by the 
tentative hypotheses held regarding the relation 
between the criterion and various items. Korner 
was apparently thinking of this stage of the 
development of predictive measurements. 

Horst then goes on to formulate the final three 
steps of combining the data to give a total 
predictive score, trying out the results on a check 
sample, and finally, if the results hold up, 
actually using the test as a predictive instrument. 
Bray would say that the validation step, or tryout 
on a check sample, is the important function 
of a test. 

Bell 16 further clarifies the functions of 
personality tests, and projectives in particular, by 
stating that projective devices serve two main 
functions: (1) the offering of rapid, valid, and 
reliable means by which a clinician may arrive 
at a picture of the personality of a subject; and 
(2) the facilitating of personality studies in 
psychological research. It is with this latter 
function of projectives that this investigator is 
concerned. He is using a projective device to 
measure an aspect of personality and to determine any 
possible relationship that that aspect of 
personality may have with teacher effectiveness. 


V. Criteria for the Adequacy of a Projective 
Technique 


According to Bell, 17 any projective technique 
must meet the following criteria: 


1. The first is that the technique must 
stimulate behavior by the subject in which the 
different layers of the personality may be manifested 
and, as much as possible, distinguishable. 

2. ....the stimulus materials used must be 
simple and readily available. 

3. ....the method must not consume more 
time than is proportionate to the value of the 
information received.... 

4. The technique should be easy to 
administer.... 

5. The method must be reliable in the sense 
of being able to produce records from an 
individual which are psychologically consistent.... 

6. The interpretations based on the records 
must be valid.... 

7. The techniques should not produce major 
disturbances in personality functioning or act as 
precipitating factors to maladjustment. 
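
The sequence attributed to Horst earlier in this discussion (assemble data on a representative group, combine the data into a total predictive score, and try the result out on a check sample) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the item scores, the criterion ratings, the 0.5 cutoff, and the function names are hypothetical and are not taken from Horst or from this study.

```python
from math import sqrt


def pearson(x, y):
    """Product-moment correlation between two equal-length lists."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ssx = sum((a - mx) ** 2 for a in x)
    ssy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(ssx * ssy)


def select_items(item_rows, criterion, cutoff):
    """Keep the items whose correlation with the criterion reaches the cutoff."""
    keep = []
    for j in range(len(item_rows[0])):
        column = [row[j] for row in item_rows]
        if pearson(column, criterion) >= cutoff:
            keep.append(j)
    return keep


def total_score(row, keep):
    """Total predictive score: the sum of the retained item scores."""
    return sum(row[j] for j in keep)


# Hypothetical development sample: one row per subject, one column per item.
dev_items = [[0, 2, 0], [0, 1, 1], [1, 2, 1], [1, 1, 1], [2, 2, 2], [2, 1, 2]]
dev_criterion = [1, 2, 3, 4, 5, 6]

# Hypothetical check sample, kept apart from the item selection.
check_items = [[0, 1, 1], [2, 2, 1], [1, 2, 1], [2, 1, 2], [0, 1, 0]]
check_criterion = [2, 5, 3, 6, 1]

# Item selection uses the development sample only.
selected = select_items(dev_items, dev_criterion, cutoff=0.5)

# The retained items are then scored and correlated with the criterion
# in the check sample.
check_totals = [total_score(row, selected) for row in check_items]
validity = pearson(check_totals, check_criterion)
print(selected, round(validity, 2))
```

The point of the sketch is only the separation of samples: items are chosen on one group and the resulting total score is evaluated on another, so that the reported correlation is not inflated by the selection itself.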


VI. The Sentence Completion Test as a Projec- 
tive Technique 


The types of projectives seem to be limited 
only by the ingenuity of the experimenter. They 
are classified in various ways. The most com- 
mon classification is in terms of the amount of 
structuring in the stimulus. A lump of modeling 
clay represents the completely unstructured med- 
ium, and a photograph represents the structured 
type. Whatever medium is used should give free 
scope to action and should provide the widest pos- 
sible latitude in choice of response or forms of 
expression, according to Symonds. 18 Frank 19 
has classified the media into four types on the 
basis of response: the constitutive, constructive, 
interpretive, and cathartic. Under this system, 
the sentence completion test would be classified 
as a constructive type in which the subject 
organizes separate meaningless parts into meaningful 
wholes. 

The most important feature of a projective 
technique, as revealed by an analysis of the research 
evidence, is not the type of stimulus provided or 
response given to it, but the interpretation which 
is made of the response. It would seem, there- 
fore, that the most important consideration for 
the selection of a projective device would be its 
use and the extent to which the medium meets 


13. Ibid., p. 619. 

14. D. W. Bray, "The Prediction of Behavior from Two Attitude Scales," Journal of Abnormal and Social Psychology, XLV (1950), p. 64. 

15. P. Horst, The Prediction of Personal Adjustment (New York: Social Science Research Council, 1941), pp. 4, 5. 

16. Bell, op. cit., p. 494. 

17. Ibid., pp. 494-5. 

18. P. M. Symonds, "Projective Techniques," Encyclopedia of Psychology, Edited by P. L. Harriman (New York: Philosophical Library). 

19. L. K. Frank, "Projective Methods for the Study of Personality," Journal of Psychology, VIII (1939), p. 402. 




the criteria proposed by Bell, as quoted in the 
preceding section. Up to the present the most 
extensive application of projectives has been in 
the diagnosis of deviant personalities. Increasing 
use of the technique with normal people has 
been noted. 

The sentence completion type of projective 
technique allows the experimenter to sample the 
subject's projection of his personality without the 
subject knowing what dimension is being meas- 
ured. At the same time, the items may be struc- 
tured in such a way that the experimenter can 
score or evaluate the responses by some 
prearranged plan. This advantage, according to 
Forer, 20 allows for a consistent approach to the 
test material. When the stimulus is minimally 
structured, the interpreter lacks sufficient in- 
formation to determine what the response 
means. 21 

It can be concluded from Forer's study of the 
structured form that the sentence completion test 
can be used for a variety of purposes with 
reasonable certainty that the responses will reveal 
attitudes or dynamics in the areas intended. 
Symonds 22 expressed the same optimism for 
projective techniques generally. 

The sentence completion technique seems to 
be appropriate to this study, which can be called 
an attitude test as well as a controlled projection 
test. Allport has characterized an attitude as a 
“state of readiness which exerts a dynamic 
influence upon the individual’s responses to all 
objects and situations with which it is related.” 23 
This study hypothesizes an attitude of acceptance of 
“objects and situations” as a “dynamic influence 
upon the behavior of the teacher.” There is, 
therefore, in Forer's terms, a preconceived plan, 


and the sentence completion test items and scoring 
can be structured to meet that demand. 


VII. The Use of the Sentence Completion Test in 
Other Studies 


Payne's 24 and Tendler’s 25 studies are generally 
considered to be the first attempts to use the 
sentence completion technique for personality 
diagnosis. Payne used fifty items in his sentence 
completion test as a personality measure in 
vocational counseling. Tendler distinguished between 
the diagnosis of thought reactions and of 
emotional responses. His criteria, scoring 
techniques, and rationale have contributed a great 
deal to subsequent studies. Tendler's efforts 
applied to emotional factors the same technique 
which had been introduced by Trabue 26 as a 
language scale in 1916, and even earlier by 
Ebbinghaus as a test of intelligence. Guilford 27 
indicated recently that some return to those 
techniques was in order: “I do not now see how some 
of the creative abilities, at least, can be 
measured by means of anything but completion tests of 
some kind.” 

Tendler used twenty incomplete sentence items 
with 250 college girls and validated the test against 
autobiographical character sketches and the 
Woodworth Personal Data blank. The items were 
intended to stimulate admiration, anger, love, 
happiness, etc. 

Little research was done with this device until 
World War II. Hutt 28 reported on the use of 
the sentence completion technique as a supplement 
to data acquired by the psychiatrist through 
other techniques. Holzberg, 29 and Holzberg, 


20. B. R. Forer, "A Structured Sentence Completion Test," Journal of Projective Techniques, XIV (1950), p. 16. 

21. Ibid., p. 11. 

22. P. M. Symonds, "New Directions for Projective Techniques," Journal of Consulting Psychology, XIII (December 1949), p. 337. 

23. G. W. Allport, "Attitudes," in Handbook of Social Psychology, Carl Murchison, Editor (Worcester, Mass.: Clark University Press, 1935), p. 199. 

24. A. F. Payne, Sentence Completions (New York: Guidance Clinic, 1928). 

25. A. D. Tendler, "A Preliminary Report on a Test for Emotional Insight," Journal of Applied Psychology, XIV (1930), pp. 123-136. 

26. M. R. Trabue, Completion-Test Language Scales (New York: Teachers College, Columbia University, 1916), p. 110. 

27. J. P. Guilford, "Creativity," (President's Address, APA, 1950) American Psychologist, V (September 1950), p. llib. 

28. Hutt, op. cit., pp. 134-140. 

29. J. D. Holzberg, "Some Uses of Projective Techniques in Military Clinical Psychology," Menninger Clinic Bulletin, IX (1945), pp. 89-93. 


Teicher, and Taylor 30 also reported the use of 
the sentence completion techniques of Tendler 
and Shor as a part of a diagnostic battery in 
military hospitals. 

Shor 31 introduced a variation to the sentence 
completion device by structuring the items to 
elicit responses to the common experiences of 
the soldier. This was the first attempt to use 
the device in a particular situation. He also 
arranged his fifty items in a “definite sequence to 
permit a carryover or generalization of attitude 
from immediate to basic human interest.” 32 

Stein 33 and Symonds 34 used this technique in 
the Office of Strategic Services as a personnel 
selection device. Stein's test sampled relevant 
information concerning at least ten areas 
considered to be important for personality evaluation. 
Symonds used two tests of fifty items each. 
Symonds concluded, even though he used only 
eighteen subjects: 


....the sentence completion test cannot 
be used to differentiate good and bad 
adjustment by any direct comparison of 
items or by psychometric methods. The 
sentence completion is descriptive and 
not evaluative. 35 


Rotter and Willerman, 36 on the other hand, used 
a forty-item test with patients in an AAF Hospital 
and claimed a validity of +.61 against psychiatric 
judgment of severity. 

Rohde, 37 Rotter, Rafferty and Schachtitz, 38 
and Wilson 39 have applied the sentence 
completion technique to the detection of emotional 
maladjustment in students. Rohde modified Payne's 
list and used sixty-four items in a validation study 
of 100 ninth-grade students with teachers' 
opinions, interviews, and Evidence Record Data as 
the criterion measures. Using Murray's system 
of interpreting behavior reactions, she scored 
the test on 39 personality categories classified 
under needs, press, and inner states. Remarkably 
high correlations were reported, the average 
being .78 for the 50 girls and .82 for the 
boys. 

The standardization and validation study by 
Rotter, Rafferty, and Schachtitz on college 
students introduced a unique system of scoring by 
example, and the experimenters concluded that 
such a system introduced the possibility of 
utilizing the sentence completion test for a number 
of screening problems. Their test was validated 
against 82 females classified as adjusted or 
maladjusted by instructors and advanced student 
clinicians; and 124 males, 78 of whom were 
classified by their instructors and 46 who were 
referred, or were self-referrals, to the Psychological 
Clinic. Correlation coefficients between test 
scores and classification by the teachers only as 
adjusted or maladjusted yielded biserial 
correlations of .50 for the females and .62 for the males. 

Wilson used forty items structured to school 
situations and validated the test against seven 
maladjusted boys and girls and fifteen 
well-adjusted boys and girls in the tenth and eleventh 
grades. A study of the formal aspects of the test 
showed no consistent differences. However, on 
some items she noted some significant differences. 
The maladjusted students felt that the rules were 
too strict and the examinations unfair, while the 
adjusted students felt the rules were not too strict 
and the examinations were hard but necessary. 

Kelly and Fiske 40 have released a preliminary 
report on the evaluation of clinical psychology 
trainees, in which 78 of 128 P-1 first-year 
trainees were assessed during the spring of 1947 
in a one-week program with a battery of tests 
and interviews. An evaluation of their work was 
made at the end of their second year by the 
University and hospital installation supervisors on 


30. J. D. Holzberg, A. Teicher, and J. L. Taylor, "Contributions of Clinical Psychology to Military Neuro-psychiatry in an Army Psychiatric Hospital," Journal of Clinical Psychology, III (January 1947), pp. 84-95. 

31. J. Shor, "Report on a Verbal Projective Technique," Journal of Clinical Psychology, II (1946), pp. 279-282. 

32. Ibid., p. 280. 

33. M. I. Stein, "The Use of a Sentence Completion Test for the Diagnosis of Personality," Journal of Clinical Psychology, III (1947), pp. 47-56. 

34. P. M. Symonds, "The Sentence Completion Test as a Projective Technique," Journal of Abnormal and Social Psychology, XLII (July 1947), pp. 320-329. 

35. Ibid., p. 321. 

36. J. B. Rotter and B. Willerman, "The Incomplete Sentences Test as a Method of Studying Personality," Journal of Consulting Psychology, XI (1947), pp. 43-48. 

37. A. R. Rohde, "Explorations in Personality by the Sentence Completion Method," Journal of Applied Psychology, XXX (April 1946), pp. 169-181. 




(1) skill in diagnosis, (2) skill in individual 
psychotherapy, (3) skill in research, and (4) 
preference for hiring. Of the four projective tests used, 


....the one showing the most promising 
validities is the Sentence-Completion Test, 
with which our projectivists had had but 
little previous experience. Furthermore, 
the Sentence-Completion Test and the 
Thematic Apperception Test were interpreted 
“blind.” 41 
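
Several of the validity figures cited in this section are biserial correlations, for example the .50 and .62 reported by Rotter, Rafferty, and Schachtitz between test scores and an adjusted-maladjusted classification. As a hedged illustration of how such a coefficient is obtained, the sketch below computes the point-biserial and the biserial coefficient from the standard textbook formulas. The scores and the grouping are invented for the example and do not come from any study cited here; the function names are likewise hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, pstdev


def split_by_group(scores, groups):
    """Return the scores of the group coded 1 and of the group coded 0."""
    ones = [s for s, g in zip(scores, groups) if g == 1]
    zeros = [s for s, g in zip(scores, groups) if g == 0]
    return ones, zeros


def point_biserial(scores, groups):
    """Point-biserial r: the grouping is treated as a true dichotomy."""
    ones, zeros = split_by_group(scores, groups)
    p = len(ones) / len(scores)
    q = 1.0 - p
    return (mean(ones) - mean(zeros)) / pstdev(scores) * sqrt(p * q)


def biserial(scores, groups):
    """Biserial r: the grouping is assumed to cut a normal variable in two."""
    ones, zeros = split_by_group(scores, groups)
    p = len(ones) / len(scores)
    q = 1.0 - p
    # Ordinate of the unit normal curve at the point dividing p from q.
    y = NormalDist().pdf(NormalDist().inv_cdf(p))
    return (mean(ones) - mean(zeros)) / pstdev(scores) * (p * q / y)


# Invented data: ten test scores and a 0/1 classification
# (1 = rated adjusted, 0 = rated maladjusted).
scores = [12, 15, 9, 20, 18, 7, 14, 22, 10, 17]
adjusted = [0, 1, 1, 1, 0, 0, 1, 1, 0, 0]

r_pb = point_biserial(scores, adjusted)
r_b = biserial(scores, adjusted)
print(round(r_pb, 3), round(r_b, 3))
```

The biserial coefficient is always larger in absolute value than the point-biserial computed on the same data, because it assumes that a continuous, normally distributed trait underlies the two-way classification. That assumption should be kept in mind when biserial validities are compared with ordinary product-moment correlations.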


VIII. The Use of Projective Tests in Evaluation 
of Teachers 


A study by Alexander 42 has indicated that a 
projective test can predict ways in which teachers 
interact with children as revealed by 
observational data. Eight pictures of the TAT type 
were used, showing teacher-pupil relations. 
Alexander concluded that one can predict ways of 
behaving and that these predictions have close 
agreement with observed behavior. 


IX. Summary 


1. The understanding of personality measurement 
is bound up with the underlying theoretical 
conception of personality. Projectives are based 
upon a dynamic conception of personality rather 
than a static one. 

2. Personality is structured and all experiences 
are integrated into a pattern consistent for the in- 
dividual. 

3. Behavior is functional and the personality 
structure reveals itself in the behavior of the 
individual consistent with his concept of himself. 
Logical consistency of behavior may be present; 
psychological consistency is always present. 

4. Personality is a depth phenomenon. Surface 
manifestations as revealed in observable 
and controlled situations make possible inferences 
regarding the latent structure and content. 

5. Projective techniques are attempts to ex- 
plore the nature of the latent structure and con- 
tent of the personality in order to predict overt 
behavior. 


6. It now seems possible to construct a 
measuring device appropriate to a variety of clinical, 
applied, and experimental purposes. 

7. The sentence completion type of projective 
technique allows for freedom of response, and 
at the same time the stimulus can be sufficiently 
structured to permit more meaningful 
interpretations of the responses. 

8. Administration of the test is relatively 
simple; it is not time-consuming, and no special 
training is ordinarily necessary. 

9. The sentence completion method now lends 
itself easily to objective scoring for screening 
or experimental purposes. Personality analysis 
for clinical appraisal and interpretation requires 
the same general skill and knowledge of 
personality as is necessary for other projective 
techniques. 

10. Even though the subject has a greater 
opportunity for disguise of purpose in the sentence 
completion test than in other projective devices, 
the advantage of partial control of the response 
allows for meaningful interpretations in certain 
experimental situations. 

11. The reliability of responses and scoring 
is not high but it is within the limits of 
acceptability. The validity of the sentence completion 
method has not been high enough to eliminate 
corroborative data, but objectification of scoring has 
recently improved the validity. 


SECTION VI 


CONSTRUCTION OF SENTENCE COMPLETION 
TEST 


I. Selection of Items 


IT WAS first necessary to formulate a 
well-defined purpose and objective for the sentence 
completion test (hereafter referred to as SCT) as 
a whole. The aim of this study, as previously 
stated, was to determine the relationship 
between the teacher's attitude of acceptance and 
his teaching efficiency. The SCT was thought to 
be the best device for measuring this attitude. 


38. J. B. Rotter, J. E. Rafferty, and E. Schachtitz, "Validation of the Rotter Incomplete Sentences Blank for College Students," Journal of Consulting Psychology, XIII (1949), pp. 348-356. 

39. I. Wilson, "The Use of a Sentence Completion Test in Differentiating Between Well-Adjusted and Maladjusted Secondary School Pupils," Journal of Consulting Psychology, XIII (December 1949), pp. 400-402. 

40. E. L. Kelly and D. W. Fiske, "The Prediction of Success in the V. A. Training Program in Clinical Psychology," American Psychologist, V (August 1950), pp. 395-406. 

41. Ibid., p. hol. 

42. T. Alexander, Jr., "The Prediction of Teacher-Pupil Interaction with a Projective Test," Journal of Clinical Psychology, VI (July 1950), pp. 273-276. 




Previous studies had established this technique 
as a valid instrument. It also seemed possible 
to structure the items sufficiently to allow for 
meaningful responses without destroying the sub- 
ject's freedom of response. 

The search of the literature revealed few 
studies in which the experimenter was using the 
technique for the measurement of a specific 
attitude and with a particular occupational group. 
Forer pointed out how this could be done. 
Furthermore, most of the studies have 
been concerned with the measurement of 
personality for clinical and diagnostic purposes. Rotter 
et al. have formulated an objective and 
economical scoring system for screening and 
experimental purposes. 

In this study the investigator established the 
hypothesis that the effective teacher was one whose 
personality organization was such that he could 
accept into his self-concept and phenomenological 
environment experiences perceived to be 
enhancing as well as threatening. The ineffective 
teacher, on the other hand, would have a tendency to 
reject the threatening experiences and accept only 
the enhancing ones. In order to measure the 
teacher's personality organization on this dimension 
of acceptance-rejection, it was necessary to 
observe several principles common to sentence 
completion tests in general and to this SCT in 
particular. 


1. The stimulus items, or phrases, should 
elicit an emotional response rather than a thought 
response. 

2. They should elicit many responses of a dis- 
criminating nature. 

3. They should stimulate the subject to pro- 
ject his self-concept into the responses. 

4. They should, in part, be structured to elicit 
responses peculiar to the educational process. 

5. They should elicit responses which can be 
scored in a reliable and valid manner. 


It was found that the incomplete sentence words 
or phrases used in the tests of Forer, Tendler, 
and Rohde met the five principles above. The 
investigator felt at the same time that the SCT should 
contain more items designed to elicit responses 
which would satisfy the fourth principle, that of 
educational situations. 

The list of 91 items finally selected for the 
SCT was made up from the following sources 
(see Appendix C in original thesis): 


B. R. Forer .......... 57 
Investigator ......... 20 
A. R. Rohde .......... 11 
A. D. Tendler ........  3 

Total ................ 91 


1. E. McGinnies, 
Social Psychology, XLV (1950), p. 28. 


(Vol. XXI 


Forer had classified his 100 items under: (1) 
inter-personal figures, (2) dominant drives, (3) 
causes of own emotional responses, and (4) re- 
actions to emotionally stimulating situations. A 
sample of each category was selected for the SCT. 
The 20 items constructed by the experimenter 
were in large part structured to elicit attitudes 
toward students, teachers, parents, and school 
Situations. A few items were suggested by other 
experimenters and writers. Rohde's items elic- 
ited reactions to other people, authority, work, 
future situations, and sources of pleasure, envy, 
and ambition. Two of Tendler's items sampled 
Sources of annoyance, and one sampled the source 
of satisfaction. ! 

The selection of these 91 items was done pri- 
marily on an empirical basis as a result of the 
experience of others. The determination of wheth- 
er these items would measure the hypothetical 
personality organization was theoretical. Only 
through a process of standardization of the scor- 
ing system could it be determined which item 
responses were discriminating ones. It will be 
seen in the rest of this section that 26 of the 91 
items could be used as significant ones inthe val- 
idation process. 


IL Preliminary Assumptions for Scoring System 


A review of the studies using the projective 
technique in personality measurement revealed 
certain phenomena which may impair the validity 
of this technique, or may increase its validity if 
the investigator can take advantage of them. “η 
either event, one must be cognizant of them in 
order correctly to score and interpret projective 
data. 

1. The investigator must have some assurance 
that the subject is revealing his true self, or at 
least the dimensions of personality that are being 
examined. One of the chief criticisms of the 
paper and pencil objective techniques has been 
the lack of assurance that the subject was respond- 
ing to the test stimuli in a manner consistent with 
his conception of himself. If the examiner wished 
to consider this discrepancy as a characteristic 
of the subject's behavior, there was little that 
could be done about it. Some of this characteris- 
tic might be revealed in a discussion of test re- 
sults with the subject. However, this evaluation 
would be quite subjective. The proponents of pro- 
jective techniques claim that they are able to 
take advantage of the subject's efforts to "look 
good" in the test situation. 

McGinnies 1 has called attention to the distinc- 
tion between perception as occurring at the level 
of implicit response, and reaction as overt ob- 
servable response. This investigator contends 
that perception is one way of responding and the 
perception of a stimulus may delay or even pre- 
vent reaction by raising the perceptual threshold. 
In any attempt to measure an aspect of personal- 
ity organization, the investigator would there- 
fore need to anticipate this possibility and adjust 
his technique to it. 

1. E. McGinnies, "Personal Values as Determinants of Word Association," Journal of Abnormal and Social Psychology, XLV (1950), pp. 28-56. 

Spencer 2 found that among 192 high school 
students, 22 percent of the subjects admitted that 
they would have left some of the questions in an 
experience appraisal blank unanswered if their 
signatures had been requested, and about 9 per- 
cent said they would have answered some of the 
questions untruthfully. Spencer observed that 
the latter group also had the highest average con- 
flict score. If such evasion is typical of apprais- 
al blank responses, it is necessary for the inves- 
tigator to allow for it in his scoring of any device, 
or preferably to use it as a significant factor. 
The flexibility of projective techniques has been 
suggested as one of the advantages of that meth- 
od. 

Another aspect of the problem raised by Mc- 
Ginnies was investigated by Carter, 3 who found 
that changes in palmar skin conductivity and re- 
action time as measured by the galvanometer 
were significantly greater in individuals with 
problems and in psychoneurotics than in normals. 
However, the oral responses of the control and 
experimental groups varied little on a modifica- 
tion of the Tendler Emotional Insight Test. This 
would seem to indicate that the perceptual thresh- 
old for verbal stimuli was higher than that for 
physiological stimuli, and, therefore, devices 
for measuring reactions to verbal stimuli were 
not so effective as those measuring reactions to 
physiological stimuli. Carter does not, however, 
indicate how the investigator would determine the 
source of the neurosis or the specific problem 
troubling the subject. 

2. A problem related to that above is the one 
of clinical judgment and clinical intuition as con- 
sidered by Klehr, 4 who showed that clinical judg- 
ment, which is not entirely intuitive, was com- 
parable to objective scoring. Fifteen experienced 
clinicians and a control group of graduate students 
were used to measure the scatter patterns on an 
equal number of normals and two groups of clin- 
ical categories. Both the clinicians and graduate 
students demonstrated results which were signif- 
icantly better than chance. Klehr attributed this 
ability to training and experience. 

3. A problem in projective techniques, and 
especially in the sentence completion form, is 
the determination of whether the subject reveals 
more about himself when he is the subject or 
when others are the subject. McGinnies 5 con- 
cluded that a subject will respond sooner to a 
word symbolizing his highest value area than he 
will to a word symbolizing his lowest value area. 

Sacks 6 found that the first-person form yield- 
ed five out of six significant differences, in his 
study of one hundred Veterans Administration 
patients. This may be consistent with the "high- 
est value" words as noted by McGinnies in that 
the person's highest value is himself. 

4. The influence of response sets upon the val- 
idity of personality tests has been examined by 
many investigators. Cronbach 7 has defined a 
response set as "any tendency causing a person 
to give different responses to test items than he 
would when the same content was presented in 
different form," and has found 8 that response 
sets are most influential in those situations where 
the subject is allowed to define the situation, and 
that they should, therefore, be avoided except in 
projective devices which capitalize on ambiguity. 
He makes a further allowance for projectives by 
noting that response sets are to a small degree 
correlated with external variables such as atti- 
tudes, interests, and personality. 

Rundquist 9 seemed to discredit the effect of 
response sets in a study in which 111 factory girls 
were asked to indicate how well 200 descriptive 
words and phrases applied to them, and immed- 
iately afterward how well they liked or disliked 
each of 100 activities. He found a consistency 
represented by a correlation of .4, and conse- 
quently doubted that response sets revealed any- 
thing basic about the individual. Rundquist indi- 
cated that "the correlation was largely a function 
of the type of material, directions, mood, or 
some other temporary condition." 

2. D. Spencer, "The Frankness of Subjects on Personality Measures," Journal of Educational Psychology, XXIX (1938), pp. 26-35. 

3. H. J. Carter, "A Combined Projective and Psychogalvanic Response Technique for Investigating Certain Affective Processes," Journal of Consulting Psychology, XI (1947), pp. 210-215. 

4. H. Klehr, "Clinical Intuition and Test Scores on a Basis of Diagnosis," Journal of Consulting Psychology, XIII (1949), pp. 34-38. 

5. E. McGinnies, "Personal Values as Determinants of Word Association," Journal of Abnormal and Social Psychology, XLV (1950), pp. 28-56. 

6. J. M. Sacks, "The Relative Effect Upon Projective Responses of Stimuli Referring to the Subject and of Stimuli Referring to Other Persons," Journal of Consulting Psychology, XIII (1949), pp. [illegible]-20. 

7. L. J. Cronbach, "Response Sets and Test Validity," Educational and Psychological Measurement, VI (1946), p. 475. 

8. L. J. Cronbach, "Further Evidence on Response Sets and Test Design," Educational and Psychological Measurement, X (1950), p. 21. 

The present investigator has attempted to 
measure the subject's tendency to project a 
certain attitude—acceptance or rejection— in 
several situations as structured by the stimulus 
phrases of the incomplete sentences. He has al- 
so been aware of the other problems raised above, 
and while the answers are still hypothetical, he 
has proceeded on the assumption that his test 
design represented feasible possibilities. An ex- 
amination of the findings may confirm them or 
may indicate better procedures. 


III. Criterion for Standardization of Scoring System 


The students’ evaluations of the 21 teachers 
at School I on Scale A, teaching effectiveness, 
were used as the criterion measure for standard- 
izing the scoring system. Scale A was selected 
because the investigator felt that it was the only 
one of the three scales that could be supported by 
previous experimental evidence; and also because 
it would have the greatest applicability for screen- 
ing purposes. A high correlation between the 
test scores and Scales B and C would have exper- 
imental significance, but little or no practical 
significance unless it were known how the same 
sample of teachers were evaluated on their teach- 
ing ability. The student evaluations were used 
in preference to administrator ratings because 
the primary purpose of this study was to examine 
the relationship between the teacher's attitude of 
acceptance, or permissiveness, and his teaching 
effectiveness in the classroom. Those most con- 
cerned with this relationship and in the best posi- 
tion to evaluate it were the students. 


IV. Review of Rationale 


The rationale for this study as contained in 
Section II has been drawn from the contributions 
of those who contend that the individual's person- 
ality is organized around his attempt to maintain 
a state of balance or unity that is consistent with 
his concept of himself and his perceptual envir- 
onment. 10 Emotional states are related to goal- 
directed behavior, and the kind and intensity of 
the emotion are related to the perceived signif- 
icance of the behavior for the maintenance and 
enhancement of the organism. The unpleasant 
and/or excited feelings accompany the goal-seek- 
ing effort of the individual, and the calm and/or 
satisfied emotions accompany satisfaction of the 
need. 

Those experiences which tend to enhance the 
organism are assimilated or accepted into a con- 
sistent relationship with the concept of self; those 
experiences which are inconsistent with the org- 
anization of self are perceived to be threats, and 
hence the organism tends to reject them or to de- 
fend itself against them. Under certain conditions 
involving the absence of threat, experiences which 
are inconsistent with the self-concept or self- 
structure may be accepted through a revision of 
the self-structure. When the individual perceives 
and accepts into one consistent and integrated sys- 
tem all his sensory experiences, then he is neces- 
sarily more understanding and accepting of others. 

This attitude of acceptance is not, according 
to Prescott, 11 to be confused with blind, passive 
acceptance, which produces dependence, the op- 
posite of maturity. Sheerer 12 has said that an 
accepting person has internalized certain values 
and principles which serve as a general guide for 
behavior and relies upon this guide rather than 
conventions or standards of other individuals, and 
does not hate, reject, dislike, or pass judgment 
against others when their behavior seems to be 
in contradiction to his own. 

This rationale has been reduced in this study 
to the hypothesis that the accepting person is a 
more effective person, and the accepting or per- 
missive teacher is a more effective teacher, than 
the non-accepting person or teacher. 

9. E. A. Rundquist, "Response Sets: A Note on Consistency in Taking Extreme Views," Educational and Psychological Measurement, X (1950), p. 90. 

10. C. R. Rogers, Client-Centered Therapy (New York: Houghton Mifflin Co., 1951), Ch. 11. D. Snygg and A. W. Combs, Individual Behavior (New York: Harper and Bros., 1949), Chs. 4 and 5. 

11. D. A. Prescott, Emotion and the Educative Process (Washington, D.C.: American Council on Education, 1938), p. 105. 

12. E. A. Sheerer, "An Analysis of the Relationship Between Acceptance of and Respect for Self and Acceptance of and Respect for Others in Ten Counseling Cases," Journal of Consulting Psychology, XIII (1949), pp. 170-171. 


V. Rationale for Scoring Part I 13 


The scoring system devised by Rotter and Raf- 
ferty served as the basis for scoring the SCT. 14 
Essentially their system consisted of scoring 
their test from examples contained in the manual. 
Each item response was assigned a weight from 
0 to 6 and an over-all score was obtained by to- 
taling the weights of each item. The scoring ex- 
amples were illustrative of certain principles, 
and were not intended to contain all possible sen- 
tence completions. 

Rotter's principles involved a distinction be- 
tween conflict and positive responses. The con- 
flict responses were those indicating an unhealthy 
or maladjusted frame of mind. The responses 
were scored according to the severity of the con- 
flict or maladjustment expressed and they were 
assigned a weight of 4, 5, or 6. The positive re- 
sponses were those indicating a healthy or hope- 
ful frame of mind and they were scored 0, 1, or 
2, depending upon the degree of good adjustment. 
Between the conflict and positive responses were 
those Rotter designated as neutral ones, which 
did not fall clearly into either the positive or neg- 
ative. They generally lacked emotional tone or 
personal reference, or the responses were de- 
scriptive and were found to be characteristic of 
both the adjusted and maladjusted. The neutral 
responses were scored 3. 

In addition to the basic principles referred to 
above, Rotter established certain other principles 
to clarify the scoring of many questionable re- 
sponses which would inevitably arise in such a 
system. These principles were designed to cov- 
er omitted and fragmentary responses, those in 
which the subject adds a qualification, those re- 
sponses in which the subject expresses more 
feeling than indicated by the example, and those 
instances in which the subject gives an unusually 
long response. 

The main distinction between Rotter's system 
of scoring and that of this investigator is in the 
definition of the scoring intervals, or the scoring 
principles. Rotter conceived of a scorable dif- 
ference in intensity of response within each of 
the positive and conflict types. This investigator 
defined the scoring intervals in terms of ego dis- 
tance. According to the rationale for this study 
established in Section II, there is a hierarchy of 
situations or experiences in one's environmental 
field, as they relate to the individual's self-con- 
cept. In one respect, it is easier for an individ- 
ual to accept impersonal situations than other 
people. Likewise, it is easier for the individual 
to accept other people than it is to accept him- 
self. In another respect it can be assumed that 
it requires less ego strength for the individual 
to accept impersonal situations than it does to 
accept others and self. The reaction of the indi- 
vidual to a perceptual stimulus can be said to be 
in terms of his interpretation of that stimulus as 
threatening or enhancing. If he interprets it as 
threatening to his self-concept, he will reject it; 
if it is perceived to be enhancing, he will accept 
it. Whether the stimulus is reasonably interpret- 
ed as threatening or enhancing is assumed to be 
dependent upon the individual's ego strength or 
the degree of his personality organization and in- 
tegration resulting from an acceptance of self. 

It was decided by this investigator to score the 
SCT responses according to the basic principles 
indicated above. An experience involving an im- 
personal situation was assumed to exist in the 
subject's environmental field (as defined by Snygg 
and Combs) and therefore was thought to be less 
ego-involving than those experiences involving 
other people. Likewise those experiences which 
directly involved the self-concept were potentially 
more threatening, or ego-involving, than those 
involving other people. 


The investigator set up a scoring system in 
which the individual's attitude toward self re- 
ceived the maximum weight of 6 for self-accep- 
tance and 0 for self-rejection. It was assumed 
that a greater degree of ego strength or person- 
ality organization was involved in experiences 
requiring self-acceptance, and hence, it should 
receive a score of 6. Likewise, those exper- 
iences simulated in the stimulus phrases which 
were threatening to the self-concept and there- 
fore rejecting should receive the maximum pen- 
alty, or 0. On this same dimension, acceptance 
of an experience through the projected response 
which indicated an acceptance of others would 
receive a score of 5, and rejection of others a 
score of 1. Acceptance of an impersonal situ- 
ation was assigned a weight of 4 and 
rejection of it a weight of 2. Those responses 
which Rotter called "neutral" were defined as 
ambivalent by this investigator and given a weight 
of 3. 


13. The investigator first attempted to score the 91 items for the 21 participating teachers at School I. It soon became apparent that 31 of the items could not be scored on the basis of the basic principles contained in this subdivision and subdivision V, and they were rejected. After the scoring system had been standardized for the remaining 60 items as Part I, the 31 items were examined again and became Part II of the SCT. However, the rationale for the two parts is the same. 

14. J. B. Rotter and J. E. Rafferty, Manual, The Rotter Incomplete Sentences Blank, College Form (New York: The Psychological Corporation, 1950), Ch. II. 


The seven-step scale, 0 to 6, used by this 
investigator was, therefore, similar to that used 
by Rotter with the exception of the definitions of 
the seven steps or intervals, and also the fact 
that the SCT was scored on a linear or unidimen- 
sional basis of acceptance. It was also assumed 
that the scale was an equal-interval one. There 
seemed to be no experimental evidence to the ef- 
fect that the distance between self-acceptance 
and acceptance of others was more significant 
than that between acceptance of others and accep- 
tance of situations. 


VI. Scoring Principles for Part I 


With this basic frame of reference, certain 
scoring principles were established. 


1. Attitudes toward self 
   a. Responses to stimulus phrases which indi- 
      cated that the subject was accepting him- 
      self, or projecting a positive attitude about 
      himself, were scored as 6. 
   b. Responses which indicated a self-rejecting 
      attitude were scored 0. 


2. Attitudes toward others 
   a. Responses which indicated that the subject 
      was accepting others were scored as 5. 
   b. Responses which indicated a rejecting atti- 
      tude of others were scored as 1. 


3. Attitudes toward situations 
   a. Responses which indicated an accepting at- 
      titude toward impersonal situations or ex- 
      periences were scored as 4. 
   b. Rejecting responses toward situations and 
      experiences were scored as 2. 


4. Ambivalence 
   a. There were no stimulus phrases in the test 
      which were structured to elicit an ambiv- 
      alent or evasive response. However, if 
      the subject chose to project an evasive re- 
      sponse, or used a catch phrase, stereotype, 
      or a song title, the response was evaluated 
      as ambivalent and scored as 3. 

b. The ambivalent response was not consid- 
ered to be inconsistent with the rationale 
outlined above. That area where the sub- 
ject crosses over from acceptance to re- 
jection was thought to be significant. The 
subject who uses an ambivalent response 
does so for some dynamic reason. 
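
The four principles above amount to a lookup from the object of the attitude (self, others, situation) and its direction (accepting, rejecting) to a weight on the 0 to 6 scale, with 3 reserved for ambivalence. A minimal sketch of that lookup follows; the names `Target`, `Attitude`, and `score_response` are illustrative assumptions and do not appear in the original study, and the judgment of which category a response belongs to still rests with the scorer.

```python
from enum import Enum


class Target(Enum):
    """Object of the projected attitude (hypothetical labels)."""
    SELF = "self"
    OTHERS = "others"
    SITUATION = "situation"


class Attitude(Enum):
    """Direction of the projected attitude (hypothetical labels)."""
    ACCEPT = "accept"
    REJECT = "reject"
    AMBIVALENT = "ambivalent"


# Weights as stated in principles 1 to 3 of the scoring system.
SCORE_TABLE = {
    (Target.SELF, Attitude.ACCEPT): 6,
    (Target.OTHERS, Attitude.ACCEPT): 5,
    (Target.SITUATION, Attitude.ACCEPT): 4,
    (Target.SITUATION, Attitude.REJECT): 2,
    (Target.OTHERS, Attitude.REJECT): 1,
    (Target.SELF, Attitude.REJECT): 0,
}

# Principle 4: evasive or stereotyped responses are scored 3.
AMBIVALENT_SCORE = 3


def score_response(target: Target, attitude: Attitude) -> int:
    """Return the 0-6 weight for a response already classified by a scorer."""
    if attitude is Attitude.AMBIVALENT:
        return AMBIVALENT_SCORE
    return SCORE_TABLE[(target, attitude)]
```

For example, `score_response(Target.OTHERS, Attitude.REJECT)` gives 1, the weight for a response rejecting others.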


As the investigator progressed with the stand- 
ardization of the scoring system, certain prob- 
lems presented themselves for which consistent 
answers were imperative. While it was desir- 
able to reduce the scoring of the SCT to a simple 
and objective basis, it was not always possible 
to do so. A certain amount of controlled judg- 
ment was necessary. 


1. It was necessary to determine whether the 
responses should be scored objectively for their 
intrinsic meaning, or whether a subjective inter- 
pretation of latent meaning was indicated. It was 
determined that all responses should be scored 
on the one dimension under investigation—accep- 
tance. To interpret the responses otherwise 
would be unreliable for purposes of this investi- 
gation. Only one thought was kept in mind, frus- 
trating though it was, namely, is the subject ac- 
cepting or rejecting himself, others, or situa- 
tions? 

2. Most stimulus phrases were structured to 
elicit a projected attitude toward self, others, or 
situations. It was noted that the subject would 
occasionally twist the response to include a ref- 
erence to self when normally it would be consist- 
ent to refer to others. "I feel that people.... 
like me" would be scored 6 for self-acceptance; 
however, the response, "....are interesting" 
would be scored 5 because the subject responded 
to the structure of the stimulus and made an "ac- 
cepting" response. 

3. Qualifications were often made by the sub- 
ject. The subject was free to respond in any 
way he desired, and the response had to be scored 
on the basis of what was said. In "The students 
in this school are....good considering their par- 
ents," the subject responded to "students" which 
the stimulus was structured to do, but the subject 
introduced an attitude toward "parents" which 
was more emotionally loaded, and therefore it 
was scored 1 instead of 4 as it would have been 
had he not qualified it. 

4. Tense was not considered to be significant. 
"When people contradict me....it used to make 
me furious" was scored as 0 even though it might 
have been reasoned that contradiction did not make 
him furious at the time he took the test. Again, 
the subject was free to say that contradiction did 
not bother him; however, he did not and, there- 
fore, the response was scored as given. 

5. Cultural semantics were evaluated in terms 
of current usage. "When I meet a parent....I 
expect most anything" was considered to be a re- 
jecting response and was scored 1. "Most wom- 
en act as though....they are demure" was scored 
3 for ambivalence. 

6. Responses with a religious quotation or im- 
plication were scored 3 for ambivalence. 

7. Omissions or fragmentary responses were 
not scored, although an allowance was made for 
them with a correction factor as used by Rotter, 
$S = \frac{N}{N - \text{omissions}} \times \text{Score}$, in which the total score (S) 
was equal to the number of items (N) divided by 
the number less the omissions, times the obtained 
score. 
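
The correction for omissions described in point 7 can be sketched as a small function. This is a hedged illustration of the stated formula only: the function name `prorated_total` and the example figures (60 items, 3 omitted, an obtained score of 171) are hypothetical and are not taken from the study's data.

```python
def prorated_total(obtained_score: float, n_items: int, n_omitted: int) -> float:
    """Scale the obtained score up to the full test length.

    Implements S = N / (N - omissions) * Score, where N is the number of
    items and Score is the sum of weights on the items actually scored.
    """
    n_scored = n_items - n_omitted
    if n_scored <= 0:
        raise ValueError("at least one item must be scored")
    # Multiply before dividing to limit floating-point rounding.
    return obtained_score * n_items / n_scored


# Hypothetical example: 57 scored items averaging 3 points each.
example = prorated_total(obtained_score=171, n_items=60, n_omitted=3)
```

The design choice here is proportional scaling: each omitted item is treated as if it had earned the subject's average weight on the items that were scored.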




All items in the SCT did not elicit responses 
which were representative of each of the seven 
intervals in the scoring scale. However, Item 
32 did contain the following sample responses: 


32. When people contradict me.... 


6. I don't mind it too much. 
   I have learned to accept it. 
   I am willing to reconsider my views. 
   I am ready to defend my statement. 


5. I try to find out if they have a good reason. 
   I feel it is their right. 


4. I attempt to find out why. 


3. It is usually disconcerting, sometimes 
   stimulating. 


2. I ignore it if courteously done. 


1. It makes me more positive in maintaining 
my position. 
It places me on the defensive. 
I stop talking. 
I don't like it. 
It is apt to irritate me. 
I become angry. 
I become embarrassed. 
It makes me feel funny. 


VII. Rationale and Scoring Principles for Part II of SCT 


The 31 items which had been difficult to score 
with the original principles which were effective 
for the 60 items in Part I were examined again. 

It was desired to use the same rationale and there- 
fore it was necessary to devise a different set of 
scoring principles. 

The investigator divided the teachers in the 
standardization group, School I, into the effective 
and ineffective teachers according to the students' 
evaluations on Scale A, teaching effectiveness. 
He then noted the responses to the 31 items which 
were given by each group. In general, the items 
were structured to elicit projected drives and 
sources of satisfaction and annoyance. The effec- 
tive teachers' responses on some of the items 
seemed to differ from those of the ineffective 
teachers in several respects. (1) The 
responses of the effective teachers were consid- 
ered to be legitimate or acceptable reaction feel- 
ings. (2) They showed ego strength and accept- 
ance of responsibility without being self-reject- 
ing. (3) They demonstrated acceptance of self 
and others, while the responses of the other 
group were more critical and showed a tendency 
to moralizing. (4) The responses of the effective 
teachers seemed to reflect a broader and more 
abstract phenomenological self without being evas- 
ive, while the other teachers were revealing a 
more concrete and personalized structure in a 
negative direction. 

Those responses which were characteristic of 
the teachers evaluated above the median on Scale 
A were given a plus score, and those responses 
characteristic of the teachers below the median 
were assigned a negative score. An inspection 
of the scores for the 21 teachers on the 31 incom- 
plete sentences revealed a significant trend on 
13 of the items. 
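
The grouping step behind the plus and minus scores can be illustrated with a median split on the criterion ratings. The sketch below is an assumption-laden illustration, not the investigator's procedure: the study identified characteristic responses by inspection, the function name `median_split` and the sample ratings are hypothetical, and leaving out a teacher who falls exactly on the median is a choice made here because the source does not say how such a case was handled.

```python
from statistics import median


def median_split(scale_a_ratings: dict[str, float]) -> tuple[list[str], list[str]]:
    """Divide teachers into above-median and below-median groups on Scale A.

    Teachers whose rating equals the median are left out of both groups
    (an assumption; the source does not specify the treatment of ties).
    """
    cut = median(scale_a_ratings.values())
    above = sorted(t for t, r in scale_a_ratings.items() if r > cut)
    below = sorted(t for t, r in scale_a_ratings.items() if r < cut)
    return above, below


# Hypothetical mean student ratings for five teachers.
ratings = {"T1": 4.2, "T2": 3.1, "T3": 3.8, "T4": 2.5, "T5": 3.5}
effective, ineffective = median_split(ratings)
```

Responses typical of the first group would then be marked plus and those typical of the second group minus, item by item.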

The plus responses were interpreted as char- 
acteristic of effective teachers or well-integrated 
and accepting personalities. This criterion of 
effective behavior seemed to be similar to the 
principles established by Tendler. 15 He inter- 
preted the responses to his test on the basis of 
positive and negative ego or social reference. 

The "ego positive" responses contained an as- 
sertive reference while the "ego negative" was 
self-depreciating. The "social positive" dem- 
onstrated an interest and feeling for others while 
the "social negative" responses revealed a fault- 
finding attitude. 

This investigator found that the effective tea- 
cher responded to the phrase "I pity...." with 
statements such as "....the boy; poor people; 
the unfortunate." The ineffective teacher was 
more apt to respond with "....a selfish person; 
the poor teachers; a person with no self-reliance." 
The stimulus phrase "I was most depressed when 
...." elicited positive responses from most of 
the effective teachers, such as "....out of work; 
I felt I was not teaching properly." The negative 
responses were more concrete and trifling: ".... 
I heard news of the election; the students didn't 
care to learn; the fog came." 

This investigator believed that these positive 
responses reflected a better personality organ- 
ization and a stronger self or ego-structure than 
those responses judged to be negative. He also 
believed that the positive responses were elicited 
by dynamics similar to those scored as accept- 
ance responses in Part I. 


VIII. Reliability of SCT 


The common methods of establishing reliabil- 
ity (test-retest and split halves) were not con- 
sidered to be appropriate to this study. The var- 
ious aspects of personality and tendencies to be- 
have are more readily modified by experiences 
and fluctuations in perception. The test-retest 
technique would measure changes in behavior 
rather than reliability. This investigator secured 
criterion and predictive measures at the same 
time, or reasonably so, in order to reduce the 
influence of temporal changes to a minimum. If 
this had not been done, the conclusions drawn 
from the experiment would have been less defen- 
sible, as one would then never know how much 
changes in experience and perception had influ- 
enced the results. Any significant correlations 
resulting from this study can, therefore, be used 
to predict teaching effectiveness from scores on 
the SCT. 

15. A. D. Tendler, "A Preliminary Report on a Test for Emotional Insight," Journal of Applied Psychology, XIV (1930), pp. 122-136. 

The split-half technique was not considered 
to be appropriate since the items on an incom- 
plete sentence blank are not considered to be 
equivalent. Any attempt to establish equivalence 
would be on an a priori basis, and a high correl- 
ation of reliability would merely indicate that the 
two parts happened to be equivalent. The inves- 
tigator could not know why there was internal con- 
sistency. In fact, there is little justification for 
the split-half technique except that empirically 
it has been found that the two methods yielded ap- 
proximately the same results. Therefore, the 
split-half technique saves several weeks’ time 
consumed by the test-retest method. 

Reliability of scoring the SCT, however, was 
considered to be very important. While consis- 
tency on the part of the subject was not consid- 
ered to be a problem in projective personality 
measurement, it was necessary to establish in- 
ter-scorer reliability. The investigator deter- 
mined reliability of scoring in two ways: 


1. Phase I: Consistency of scoring 445 rep- 
resentative item responses for Part I between 
the investigator and five other scorers. 

2. Phase II: Consistency of total scores be- 
tween the investigator and two other scorers for 
Parts I and II of the SCT. 


For Phase I of the reliability procedure, a 
compilation was made of 445 representative re- 
sponses to the 27 items which met the inspection 
standards of significance as indicated in Section 
V. These scoring examples were to be used in 
the validation procedure of Schools II and III. It 
was not expected that every possible response 
to the incomplete sentences would be found in the 
responses of the teachers in School I. The scor- 
ing by example could only hope to establish by 
illustration the basic scoring principles. Judg- 
ment would be necessary in scoring those re- 
sponses for which there was no example. 

Each of the five raters who were selected to 
score the 445 sample responses had had some 
teaching and counseling experience. Three of 
the five were graduate students in clinical psy- 
chology. Two of the five had had training in ed- 
ucation primarily and some training in psychol- 
ogy. 

The five raters were given a list of the items 
to be scored with only the principles of scoring 
as contained in subdivision IV of this Section. 
They were asked to evaluate each response item 
and determine whether it should be scored 6, 5, 
4, 3, 2, 1, or 0. Each rater scored the items 
independently. 

Part I of Table II shows the degree of agree- 
ment between the six scorers, including the in- 
vestigator. All six scorers, on 255 responses 
or 57.3 percent, agreed that the item was either 
an accepting or rejecting one. If the response 
was scored as 6, 5, or 4 it was considered to be 
an acceptance response. If it was scored 2, 1, 
or 0, it was considered to be a rejection response. 
When the ambivalent score of 3 was added to 
either the acceptance range, 6-4, or the rejec- 
tion range, 2-0, the six scorers agreed on 136 
additional items, or 30.6 percent. There was 
disagreement on whether the response was pro- 
jecting an attitude of acceptance or rejection in 
54 items only, or 12.1 percent. There seemed 
to be little difficulty in the recognition of a differ- 
ence between an expressed attitude of acceptance 
and one of rejection. 
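
The three-way breakdown just described (agreement on direction, agreement once the ambivalent score of 3 is allowed on either side, and disagreement) can be sketched as a classification of the six scores assigned to one response. The function names and category labels are illustrative assumptions; treating a response scored 3 by every scorer as "agreement with ambivalence" is also an assumption, since the source does not mention that case.

```python
def direction(score: int) -> str:
    """Collapse a 0-6 interval score into accept (6-4), reject (2-0), or ambivalent (3)."""
    if score >= 4:
        return "accept"
    if score <= 2:
        return "reject"
    return "ambivalent"


def agreement_class(scores: list[int]) -> str:
    """Classify one response by how far the scorers agree on its direction."""
    kinds = {direction(s) for s in scores}
    if "accept" in kinds and "reject" in kinds:
        return "disagreement"
    if "ambivalent" in kinds:
        return "agreement_with_ambivalence"
    return "agreement"


def percent(count: int, total: int) -> float:
    """Percentage of responses, rounded to one decimal as in the reported table."""
    return round(100 * count / total, 1)
```

With the counts reported above, `percent(255, 445)`, `percent(136, 445)`, and `percent(54, 445)` reproduce the stated 57.3, 30.6, and 12.1 percent.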

Part II of Table II shows the number of scor- 
ers who agreed on one interval score. All six 
scorers were in complete agreement on 181 re- 
sponses, i.e., all six scored the item as 6, 5, 
etc. Five out of six agreed on one interval score 
for 84 responses. At least three of the scorers 
agreed on 99 percent of the item responses. 

It was thought that the range of scores assigned 
to each response would be indicative of the inter- 
scorer reliability. Part III of Table II shows that 
in 83.8 percent of the scored responses there 
was not more than two points difference between 
the six scorers. 
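
The two remaining measures, the number of scorers sharing one interval score and the range of scores given to a response, are simple to compute from the six scores. The sketch below uses hypothetical function names; the two sample score sets follow the patterns reported in this section for responses to Item 32 (five scorers at 0 with one at 2, and four scorers at 2 with one at 6 and one at 1).

```python
from collections import Counter


def modal_agreement(scores: list[int]) -> int:
    """Largest number of scorers who assigned the same interval score."""
    return Counter(scores).most_common(1)[0][1]


def score_range(scores: list[int]) -> int:
    """Difference between the highest and lowest score given to a response."""
    return max(scores) - min(scores)


# Score patterns of the kind described for Item 32.
embarrassed = [0, 0, 0, 0, 0, 2]   # five scorers at 0, one at 2
ignore_it = [2, 2, 2, 2, 6, 1]     # four at 2, one at 6, one at 1
```

Here `embarrassed` counts as "five out of six" with a range of 2, while `ignore_it` counts as "four out of six" with a range of 5.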

The responses to the stimulus phrase "When 
people contradict me...." are illustrative of the 
differences noted between the six scorers. The 
response "....I wonder about their background" 
was scored as a rejection of others and given a 
score of 1 by all scorers. To the response ".... 
I become embarrassed," five scorers assigned 
a score of 0, assuming that the response was a 
self-rejecting one. One scorer judged it to be a 
projection to a situation and therefore scored it 
as 2. The same scorer was alone on two other 
responses to the same stimulus phrase, "....I 
become angry" and "....it places me on the de- 
fensive." He scored them as 2, while the other 
five scored them as 0. This difference in inter- 
pretation would account for the larger number of 
responses with a range of 2 points than of one 
point. The scorers seemed to have little difficul- 
ty discriminating between self and others, but 
they did show a differentiation of the situational 
responses as just noted. 


TABLE II

INTER-SCORER RELIABILITY — PART I (PHASE I)

                                         Number of
Scorers                                  Responses   Percent

1. Agreement between six scorers
     Acceptance or rejection                255        57.3
     Acceptance or rejection with
       ambivalence score                    136        30.6
     Disagreement                            54        12.1
     Total                                  445       100.0

2. Agreement on interval score between
   six scorers
     All six                                181        40.7
     Five out of six                         84        18.9
     Four out of six                        103        23.2
     Three out of six                        72        16.2
     Two out of six                           9         1.0
     Total                                  445       100.0

3. Range of interval scores between
   six scorers
     0                                      181        40.7
     1                                       62        13.9
     2                                      130        29.2
     3                                       27         6.1
     4                                       36         8.1
     5                                        6         1.3
     6                                        3          .7
     Total                                  445       100.0




The response ‘‘....I ignore it if courteously 
done’’ is an illustration of the disagreement be- 
tween acceptance and rejection. Four of the scor- 
ers gave it a score of 2, indicating a rejection of 
the situation, while one gave it a score of 6, as- 
suming that the response showed ego strength or 
self-acceptance. One judged it to represent a 
rejection of others and scored it as 1. 
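The two summary measures used in Table II, the number of scorers agreeing on one interval score and the range of scores given to a response, can be sketched as follows. This is only an illustration, not the investigator's procedure: the function name `agreement_and_range` is hypothetical, and the score lists are the three responses quoted in the text, with the order of scorers chosen arbitrarily.

```python
from collections import Counter


def agreement_and_range(scores):
    """Return (largest number of scorers giving the same score, score range)."""
    largest_group = Counter(scores).most_common(1)[0][1]
    return largest_group, max(scores) - min(scores)


# Interval scores (0 to 6) from the six scorers for three responses to
# "When people contradict me....", as described in the text.
responses = {
    "I wonder about their background": [1, 1, 1, 1, 1, 1],
    "I become embarrassed": [0, 0, 0, 0, 0, 2],
    "I ignore it if courteously done": [2, 2, 2, 2, 6, 1],
}

summary = {text: agreement_and_range(s) for text, s in responses.items()}
for text, (agree, spread) in summary.items():
    print(f"{text}: {agree} of 6 scorers agree, range {spread}")
```

Tallying these two values over all 445 responses would give the frequency distributions reported in Parts 2 and 3 of the table.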

In spite of the shadings of interpretation
possible in each response, the reliability of scoring
seemed to be rather high. Each scorer indicated
that he had some difficulty evaluating the response
according to the scoring principles only. There
was a frequent temptation to interpret the
response dynamically for latent clinical content.
Some also expressed the feeling that personal bias
entered into their scoring. In spite of these
factors, the inter-scorer reliability is sufficiently
high to justify the scoring system.

The second phase of establishing the reliability
of the scoring system consisted of correlating
the total scores of the investigator and two other
scorers for ten randomly selected tests among the
validation group, Schools II and III. After the
investigator had scored all of the 83 tests blind for
the teachers at Schools II and III, he submitted
every eighth test to two professors of educational
psychology, both Fellows in the Division of
Clinical and Abnormal Psychology of the American
Psychological Association. They were given a
copy of the scoring principles and list of scoring
examples to assist them in scoring the ten tests.
It will be noted from Table III that the rank order
correlations between the investigator and the two
scorers, A and B, indicate that the SCT can be
scored reliably from scoring examples.
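A rank order correlation of the kind reported in Table III can be sketched as below. The text does not say which rank coefficient was computed, so Spearman's rho is assumed here; the helper names are hypothetical, and the ten pairs of total scores are invented purely for illustration and are not data from the study.

```python
def ranks(values):
    """Rank values from 1 (smallest) upward; assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rho(x, y):
    """Spearman rank-order correlation for untied data."""
    n = len(x)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Hypothetical total scores on ten tests (illustrative only).
investigator = [52, 47, 61, 38, 55, 43, 66, 49, 58, 41]
scorer_a = [50, 48, 63, 36, 53, 45, 64, 46, 60, 40]

rho = spearman_rho(investigator, scorer_a)
print(round(rho, 3))
```

With only ten tests per comparison, a single pair of swapped ranks changes the coefficient noticeably, which is worth keeping in mind when reading the values in Table III.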

Phases I and II for establishing inter-scorer
reliability would indicate that the investigator had
established a consistent and understandable
system of scoring the SCT. Phase I of the reliability
procedure was considered to be a most significant
one. (1) The five scorers had available only
the general scoring principles. The fact that the
investigator and the five scorers were in agreement
on the difference between acceptance and
rejection would show that the dimension of
personality under investigation could be recognized. (2)
Their ability to distinguish between different
levels of acceptance and rejection as indicated by
their general agreement on interval scores would
seem to show that ego distance was a measurable
quantity. (3) The six scorers for Phase I were
not considered to be atypical of those now
responsible for teacher trainee and employee selection.
(4) The inter-scorer agreement would tend to
give a high estimate of internal consistency for
the SCT.

Phase II of the reliability procedure would
seem to indicate that the total score for the SCT
could be used in the determination of a cut-off
score for selection purposes.




SECTION VII


CRITERION MEASURES — FINDINGS 
AND CONCLUSIONS 


THIS SECTION will present the findings
and conclusions revealed in the analysis of the
criterion data used in this investigation. Section
VIII will offer the findings from the predictive
measures.

The investigator was primarily concerned
with the influence of a dimension of personality
upon teacher effectiveness. It was, therefore,
necessary to establish reliability of both criterion
and predictive measures before any valid
conclusions could be drawn from this study.

The criterion measures used in this investigation
served a primary purpose of correlating
the SCT against a measure of teacher effectiveness.
A secondary purpose of the criterion measures
was to study some of the factors involved in
sampling procedures.

This section will present the findings in three 
parts: (1) reliability of the criterion measures, 
(2) sampling differences between participating 
and non-participating teachers, and (3) miscel- 
laneous findings. 


I. Reliability of Criterion Measures 


The criterion measures consisted of the student,
administrator, and self evaluations on three
scales, A, B, and C. Scale A, teacher effectiveness,
was considered to be the most useful
measure for purposes of this investigation, as
the investigator was primarily concerned with
the relationship between the teacher's attitude of
acceptance and his effectiveness as a teacher.
A global evaluation of effectiveness on an
unstructured and non-itemized scale was believed to be
the best procedure to sample the raters' feelings
of the teachers' effectiveness.

The student ratings were the only ones for
which reliability could be established, and only
Scale A was used for this purpose. There was
no reason to believe that the reliability for Scales
B and C would be different. School III was used
for convenience; and again, there was no reason
to believe that the students at School III would
differ significantly from those at Schools I and
II.


Reliability was determined as follows: Pairs
of mean ratings on the 32 teachers at School III
were obtained for two groups of randomly selected
students. The reliability coefficient was .75.
When this correlation was corrected by the
Spearman-Brown formula for a typical class of thirty
students, a correlation of .88 was obtained, which
would indicate that student evaluations were a
reliable criterion measure to use in this investigation.
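The Spearman-Brown step-up described above can be sketched as follows. The formula itself is the standard one; what the text does not give is the size of the two student groups, so the lengthening factor `k_assumed` (2.5, for example groups of 12 stepped up to a class of 30) is an assumption chosen because it reproduces the reported figure. The function names are hypothetical.

```python
def spearman_brown(r, k):
    """Reliability predicted when the number of raters is multiplied by k."""
    return k * r / (1 + (k - 1) * r)


def lengthening_factor(r_observed, r_target):
    """Factor k by which the rater group must grow to reach r_target."""
    return r_target * (1 - r_observed) / (r_observed * (1 - r_target))


r_groups = 0.75      # reported correlation between the two student groups
k_assumed = 2.5      # assumption: not stated in the text

stepped_up = spearman_brown(r_groups, k_assumed)
k_implied = lengthening_factor(r_groups, 0.88)

print(round(stepped_up, 2), round(k_implied, 2))
```

Working backward from .75 and .88 gives a factor of a little under 2.5, which suggests that each of the two random groups held roughly a dozen students per teacher, though that reading is an inference and not a reported figure.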




Scales B and C were used to determine if factors
measured in those two scales had an influence
on teacher effectiveness as measured by
Scale A. There was also the possibility that
Scales B and C might correlate higher than Scale
A with acceptance as measured by the SCT.
Section IV raised the question of whether the three
scales differed fundamentally in their purpose or
rationale. Although it was not the purpose of this
investigator to answer categorically the question
of what constitutes the effective teacher, it was
thought that Scales B and C might conceivably
include some elements which were contributing
to our understanding of the effective teacher.

The relationship between Scales A, B, and C 
for all participating teachers as evaluated by the 
students can be seen from the following product- 
moment correlations: 


Scales A and B    .66
Scales A and C    .72
Scales B and C    .16


The following conclusions may be drawn from 
these inter-correlations: 


1. All three scales are to some extent meas- 
uring the same or similar factor or factors. 

2. The effective teacher, according to the 
students, tends to be the one who accepts the stu- 
dents and trusts them (Scale B), and who also 
goes about his teaching with a minimum of strain 
and effort (Scale C). 

3. Some of the factors which the students used 
to evaluate their teachers’ effectiveness were 
also used in evaluating their teachers’ attitudes 
toward them and their teachers’ ease of teaching 
and sense of humor. 

4. A response-set established in Scale A 
caused the students to evaluate their teachers in- 
discriminately on Scales B and C. 
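For readers who want to see how inter-scale coefficients of this kind are obtained, a minimal sketch of the product-moment correlation follows. The function name `pearson_r` is hypothetical, and the mean ratings for five teachers are invented for illustration only; they are not the study's data and will not reproduce the coefficients above.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical mean student ratings of five teachers on the 9-point scales.
scale_a = [6.1, 6.8, 5.9, 7.2, 6.5]
scale_b = [7.0, 7.5, 6.6, 7.9, 7.4]
scale_c = [7.3, 7.1, 6.9, 8.0, 7.6]

inter_correlations = {
    "A and B": pearson_r(scale_a, scale_b),
    "A and C": pearson_r(scale_a, scale_c),
    "B and C": pearson_r(scale_b, scale_c),
}
for pair, r in inter_correlations.items():
    print(f"Scales {pair}: {r:.2f}")
```

A high coefficient between two scales cannot by itself separate conclusions 1 to 3 from conclusion 4, since a shared response set and a genuinely shared factor both raise the correlation.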


The reliability of the criterion scales was also
studied with reference to inter-rater consistency
at Schools II and III.

It can be seen from the product-moment
correlations in Table IV that there was little
consistency between the three classes of raters. Each
group of raters was apparently evaluating the
teachers according to a different standard. The
students and administrators showed a closer
relationship than did the teachers' self evaluations
with either students or administrators. In fact,
the students and administrators at School III
showed a near significant reliability on Scales B and
C.

These data would indicate that composite ratings
were possible between students and administrators,
but that self evaluations were not practical.
The evidence from Table IV would further
indicate that the investigator would be forced to
accept the criterion evaluations of one group of
raters only. Because the student ratings were
found to be a reliable measure, and also because
the scoring system for the SCT was standardized
on the student evaluations of teacher effectiveness,
it would seem that for purposes of this
investigation student ratings were the only reliable
measure to be used.


II. Validity of Predictive Measure


A preliminary evaluation of the validity of the 
SCT as a predictive measure of teacher effect- 
iveness may be helpful at this point in order to 
clarify the design used in this investigation. 


1. The literature has revealed a growing
conviction that behavior in specific situations may
be predicted from projected verbal attitudes and
feelings as expressed in response to semi-structured
stimulus words and phrases.

2. The rationale for this study consisted of a
belief that the effective person felt secure and
unthreatened. The students at Schools I and II
and the administrator at School III consistently
rated the participating teachers higher than the
non-participating teachers. The administrators
at Schools I and II reversed the trend noted above.
(The non-participating teachers at School III were
rated by the administrator only.)

This same tendency of the lower rated teachers,
as evaluated by the students and one administrator,
to reject a threatening situation was also
noted in the tendency of the lower rated teachers
to reject the evaluations of themselves. A
significant number of the 33 who did not rate
themselves, but who did complete the SCT, were rated
below the mean by the students. It would,
therefore, seem to indicate that the ineffective
teacher from the students' viewpoint is also the
one who could not accept the threatening situations
involved in this study. It was also found
that a significant difference existed between the
mean scores on the SCT of the 71 teachers who
evaluated themselves and the 33 who did not.

3. The remainder of this study (Section VIII) 
is devoted to establishing validity for the SCT by 
product-moment correlations between the SCT 
and the various criterion measures. 


III. Sampling Differences Between Participating
and Non-Participating Teachers


The reliability of the sample used in this study
can be approached from an analysis of the
differences between the teachers who were evaluated
only and those who were evaluated and also
participated in the standardization of the predictive
measure. This investigator was eager to know
if the criterion measures would reflect the
influence of the personality dimension under
investigation, namely, acceptance. It was recognized
that participation in this project constituted a
challenge to the personality structure of the
teacher. The design of the study provided evaluations
on all classroom teachers at Schools I and
II by students and administrators. At School III,
all classroom teachers were rated by the
administrator, and the students rated only the
participating teachers. If any significant differences
were noted on the criterion scales, it could
conceivably mean that the accepting teacher could
also accept this threatening assignment. It would
also mean that sampling procedures in any study
of this type would have to recognize this possible
source of error.


TABLE III

INTER-SCORER RELIABILITY — PARTS I AND II OF SCT
(PHASE II)

Scorers                    Correlations

Part I
  Investigator and A           .86
  Investigator and B           .91
  A and B                      .95
Part II
  Investigator and A           .84
  Investigator and B           .89
  A and B                      .90


TABLE IV

INTER-RATER RELIABILITY — CRITERION MEASURES

Scales                         School II    School III

Scale A
  Students-Administrators
  Students-Self
  Administrators-Self
Scale B
  Students-Administrators
  Students-Self
  Administrators-Self
Scale C
  Students-Administrators
  Students-Self
  Administrators-Self

[Correlation values illegible in the source; the only
surviving entry is -.22, under School II.]

It will be noted from Table V that the students
at Schools I and II and the administrator at School
III consistently rated the participating teachers
higher than they did the non-participating teachers.
There were, however, only two significant
differences, Scales B and C at School I. Scale
C reflected the trend more than the other two
scales. Apparently those teachers who go about
their teaching with the least effort were also most
cooperative in this investigation.
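The critical ratios discussed here can be approximated from the means, standard deviations, and group sizes in Table VII. The text does not state the formula the investigator used, so the sketch below assumes the usual critical ratio for two independent means, the difference divided by its unpooled standard error; the function name is hypothetical. The example values are the School I student ratings on Scale B as read from Table VII.

```python
from math import sqrt


def critical_ratio(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """Difference between two independent means over its standard error."""
    standard_error = sqrt(sd_1 ** 2 / n_1 + sd_2 ** 2 / n_2)
    return (mean_1 - mean_2) / standard_error


# School I, student ratings, Scale B (Table VII):
# participating teachers      N = 21, mean 7.20, sigma .62
# non-participating teachers  N = 25, mean 6.62, sigma .87
cr = critical_ratio(7.2, 0.62, 21, 6.62, 0.87, 25)
print(round(cr, 2))
```

Under that assumption the result is close to the 2.62 given for this comparison in Table V, and it exceeds the 2.58 usually required for significance at the .01 level.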

The administrators at Schools I and II reversed
the trend noted in the student evaluations at those
schools in that the administrators were almost as
consistent in evaluating the participating teachers
lower. None of the differences was found to be
significant, but the trend is evident.

It would, therefore, seem evident that the
effective teacher had a tendency to be less threatened
by the project used in this study and was
more willing to accept the inconvenience and
possible exposure. It is also indicated that any
investigator should be cautious in assuming that
voluntary participants are representative of the
whole population.

A comparison of the mean criterion scores 
for the 71 participating teachers who completed 
the self-evaluations and the 33 who participated 
but did not evaluate themselves shows the same 
trend in Table VI as noted in Tables V and VIII. 

The 71 participating teachers who completed 
the self evaluations were rated higher by the stud- 
ents on all three scales than the 33 who did not 
evaluate themselves. The administrators, how- 
ever, reversed the students’ trend by evaluating 
higher the 33 participating teachers who did not 
evaluate themselves. 

The students and administrators did not know 
at the time they evaluated the teachers which tea- 
chers were participating, but they were both able 
to distinguish the two groups of teachers. How- 
ever, the standards by which they rated the tea- 
chers apparently differed. 

If the accepting person is one who can accept
a threatening situation such as a self evaluation,
and the effective teacher is an accepting person,
then obviously the students are better able to
select the accepting person and the effective
teacher than the administrators.

The sex of the teachers was apparently not a
contributing factor to whether the teacher
participated or not. An equal number of male and
female teachers volunteered at School III. At School
I, 50 percent of the male teachers and 40 percent
of the female teachers volunteered. At School II,
the female teachers volunteered in larger numbers
than did the male teachers, 64.5 and 61 percent
respectively.


IV. Miscellaneous Findings 


An examination of the mean scores from Table 
VII indicates that the evaluations of the teachers 
were consistently above the middle score of 5 on 
the 9-point scales. These above-average scores 
reveal a negative skewness as typically found in 
distributions of ratings. The distributions on 
Scale A were generally less skewed than those 
on Scales B and C. 

The consistency of the criterion measures between
the different raters and the different
schools can be noted from Table VIII.

Section A of Table VIII indicates that all ex- 
cept one of the critical ratios between the stud- 
ents’ and administrators’ ratings are significant 
at the 5 percent level or better. The students 
consistently rated the teachers higher than the 
administrators did. Only at School II on Scale A 
was the difference not significant. 

An examination of Sections B and C of Table
VIII shows that Schools II and III differed
significantly on all of the student measures and on two
of the three administrator measures. Insignificant
differences are noted between Schools I and
II on the student measures, and on Scale B only
are the administrators' measures significantly
different. Schools I and III show approximately
the same relationship.

Some explanation for this trend can be gained 
from Table IX. 

The students at School III rated their teachers 
higher on all three scales than did the students 
at either of the other two schools. There is al- 
so a consistency to be noted in the administrators' 
ratings. The ratings at School II were highest, 
with School III next and School I lowest. 

It would therefore seem that the criterion
measures used in this investigation were consistent
within each school as compared with each
other. This fact may indicate that the criterion
samples have a certain point of reference which
is peculiar at each school. One cannot interpret
these findings to mean that the teachers are better
at one school than they are at another. However,
it can be safely concluded that according
to the rank order of mean scores the students
and administrators maintained their relative
positions on all three scales and thus demonstrated




TABLE V

RELIABILITY OF THE DIFFERENCE BETWEEN CRITERION MEASURES FOR
PARTICIPATING AND NON-PARTICIPATING TEACHERS

                                    Critical Ratios*
Schools               Scale A            Scale B          Scale C

School I
  Students            (+)  .79           (+) 2.62***      (+) 2.72***
  Administrators      (-) 1.22                .00         (-)  .37
School II
  Students            (+) [illegible]    (+)  .9          (+) 1.43
  Administrators      (+)  .04           (-)  .85         (-)  .3
School III
  Administrators      (+)  .83           (+)  .91         (+) 1.91

*   The critical ratios in this table and subsequent ones were
    computed from the data contained in Table IX.
**  The (+) sign indicates that the participating teachers were
    evaluated higher than the non-participating teachers, as noted
    in Table IX. The (-) sign indicates that the participating
    teachers were evaluated lower.
*** Significant at .01 level.


TABLE VI

COMPARISON OF MEAN SCORES FOR TEACHERS WHO DID AND DID NOT
COMPLETE THE SELF EVALUATIONS

                                 (N = 71)             (N = 33)
                                 Teachers Evalu-      Teachers Not
                                 ating Themselves     Evaluating
Mean Scores                                           Themselves

Students' Mean Scores
  Scale A                             6.6                 6.5
  Scale B                             7.4                 7.3
  Scale C                             7.6                 7.4
Administrators' Mean Scores
  Scale A                             5.8                 6.2
  Scale B                             5.9                 6.5
  Scale C                             6.1                 6.4


TABLE VII

RESULTS OF CRITERION MEASURES

Schools                          N   Scale A    σ    Scale B    σ    Scale C    σ

School I
  Students' Evaluations
    Participating Teachers      21    6.51    1.0     7.2      .62    7.46   [illegible]
    Non-Participating Teachers  25    6.26    1.04    6.62     .87    6.68    1.17
    Difference                       + .25           + .58           + .78
  Administrators' Evaluations
    Participating Teachers      21    5.63    1.45    5.58    1.61    5.85    1.31
    Non-Participating Teachers  25    6.16    1.4     5.58    1.22    5.98    1.03
    Difference                       - .53             .00           - .13
  Self Evaluations              15    6.21    1.47    7.2     1.51    7.46    1.21

School II
  Students' Evaluations
    Participating Teachers      51    6.23     .98    7.12     .82    7.18    1.02
    Non-Participating Teachers  31    6.2     1.16    6.94     .88    6.85    1.23
    Difference                       + .03           + .18           + .33
  Administrators' Evaluations
    Participating Teachers      51    5.97    1.09    6.51    1.17    6.49    1.16
    Non-Participating Teachers  31    5.96     .98    6.74    1.18    6.57    1.19
    Difference                       + .01           - .23           - .08
  Self Evaluations              28    6.07    1.36    7.36    1.34    7.18    1.44

School III
  Students' Evaluations
    Participating Teachers      32    6.72    1.10    7.75     .97    8.25     .66
  Administrators' Evaluations
    Participating Teachers      32    5.8     1.12    5.88    1.22    5.87    1.1
    Non-Participating Teachers  70    5.64    1.27    5.64    1.23    5.39    1.23
    Difference                       + .16           + .24           + .48
  Self Evaluations              28    5.65    1.32    8.14    1.06    7.86    1.33




TABLE VIII

RELIABILITY OF THE DIFFERENCE BETWEEN THE CRITERION
MEASURES FOR PARTICIPATING TEACHERS

                                           Critical Ratios
                                   Scale A      Scale B      Scale C

A. Students and Administrators
     School I                       2.4**        4.2*         4.9*
     School II                      1.2          3.1*         3.3*
     School III                     3.3*         6.7*        10.8*

B. Students' Evaluations and
   Different Schools
     Schools I and II            [illegible]  [illegible]  [illegible]
     Schools II and III             2.0**        2.6*         5.8*
     Schools I and III               .7          1.1       3.[illegible]*

C. Administrators' Evaluations
   and Different Schools
     Schools I and II                .9          2.4          1.9
     Schools II and III              .7          2.4          2.6*
     Schools I and III               .4           .7       [illegible]

*  Significant at .01 level.
** Significant at .05 level.


TABLE IX

RANK ORDER OF SCHOOLS ON EVALUATION SCALES

Rank        Scale A     Scale B     Scale C

Students
  1           III         III         III
  2           I           I           I
  3           II          II          II
Administrators
  1           II          II          II
  2           III         III         III
  3           I           I           I




a consistent response set or point of reference.

A related aspect of the differences between
raters is the possible influence of grade level and
sex of the students. The implication has been
made in the literature that students are not mature
enough to evaluate their teachers realistically,
and further that the younger students are
less able to rate. This investigator has noted
that few studies have attempted to determine the
differences, or sets, between grade levels.

From Table X it can be seen that the teachers
of all three grades at School II were rated highest
by the tenth graders on all three scales. With
the exception of Scale A, they were also rated
higher by the twelfth graders than they were by
the eleventh graders. Inasmuch as the same
teacher was evaluated by all three grades, one
of the contributing variables would have to be the
students' grade level. It is possible that the 29
teachers were more effective with twelfth grade
students, but it is not probable that that would be
the most significant variable in this instance.

The only significant difference between the
three grades was that between the tenth and
twelfth graders on Scale A. There was a near
significant difference between the tenth and
eleventh graders. Had the twelfth graders rated the
teachers lower than the eleventh graders, as they
did on Scale A, there might have been an important
difference between Scales B and C.

While there is a tendency for the tenth graders 
to rate higher, one cannot conclude that they are 
less right. Neither can one conclude that, be- 
cause the ratings of the eleventh and twelfth grad- 
ers were more like those of the administrators, 
the older students were more right. 

The reliability of the differences between male
and female students in their evaluations of their
teachers showed no significant differences on
Scales A and C, as noted in Table XI. However,
on Scale B the female students evaluated their
teachers much higher than did the males—critical
ratio, 3.41. Apparently the female students
feel that their teachers respect and trust them
more than do the male students.

The investigator endeavored to determine the
feasibility of using self evaluations as criterion
measures. Any conclusions to be drawn from
the findings will necessarily be weakened by the
fact that only 71 of the 104 participating teachers
completed the criterion measures.

Table XII reveals that the students generally
rated the teachers higher and the administrators
rated them lower than the teachers rated themselves.
The teachers at School III rated themselves
higher on Scales B and C than did the teachers
at Schools I and II, but they also rated
themselves lower on Scale A. The only significant
difference between the student and the teacher
ratings was on Scale A at School III, where the
students rated the teachers higher and the teachers
rated themselves lower than at either School I
or School II. There were significant differences
on Scales B and C between the administrator and
self evaluations at all three schools, where the
teachers consistently rated themselves higher.

V. Summary 


1. Student ratings of teacher effectiveness 
were found to be reliable, with a correlation of 
.88 based on a class of 30 raters. 

2. Inter-scale correlations of student ratings
showed that the effective teacher tended also to
be an accepting teacher (correlation between
Scales A and B, .655). The effective teacher
was also one who taught with ease and with a
sense of humor (correlation between Scales A
and C, .72). The teacher felt to be the most
accepting by the students was also the one who taught
with ease (correlation between Scales B and C,
.16).
3. No relationship was found to exist between 
the ratings of the students, the administrators, 
and the teachers’ ratings of themselves (Table 
IV). 
4. A positive tendency was noted on all three
scales for the students to evaluate the participating
teachers higher than those teachers who did
not participate in the standardization of the SCT
(Table V). The administrator ratings at School
III showed the same trend, while the administrators
at Schools I and II showed a tendency to rate
the non-participating teachers higher on all three
scales.

5. No sex difference was noted on the tendency
of the teachers to participate in the standardization
phase of this study.

6. All distributions of evaluation scores showed
a negative skewness with distributions on Scale
A generally showing less skewness (Table VII).

7. A consistent tendency was noted for the
students to evaluate the teachers significantly
higher than did the administrators. The teachers
also rated themselves higher than did the
administrators, significantly so on Scales B and
C (Table VII).

8. The rank order of mean scores by schools
for both students and administrators was
maintained on all three scales (Table IX).

9. The tenth grade students showed a tendency
to rate their teachers higher than did the other
two grade levels, but only one significant difference
was noted, on Scale A (Table X).

10. The male students showed no discriminating
difference from the female students on Scales
A and C. However, on Scale B, the male students
felt less accepted and trusted than did the female
students (Table XI).

11. It was noted that the participating teachers
were somewhat frustrated and anxious about




TABLE X

STUDENT EVALUATIONS OF TEACHERS OF ALL THREE GRADES
SCHOOL II*

Grades                       Scale A     Scale B       Scale C

Mean Scores
  10th Grade                   6.9         7.4           7.4
  11th Grade                   6.3         7.0           7.0
  12th Grade                   6.2         7.2           7.1
Critical Ratios
  Difference between:
  10th and 11th Graders        2.0         1.6           1.3
  11th and 12th Graders         .4      [illegible]       .4
  10th and 12th Graders        2.5**       1.1           1.0

*  N = 29 teachers.
** Significant at .05 level.


TABLE XI

MALE AND FEMALE STUDENTS' EVALUATIONS OF TEACHERS
SCHOOL II*

                             Scale A     Scale B       Scale C

Males—Mean Score               6.3         6.9           7.0
Females—Mean Score             6.4         7.4           7.0
Difference                      .1          .5            0
Critical Ratio                  .2         3.4**          0

*  N = 61 teachers.
** Significant at .01 level.




completing the evaluations of themselves. The
fact that 33 of the 104 teachers did not rate
themselves on all three scales might be interpreted
to mean that they could not accept themselves or
the situation. Section VIII will show that a
significant percentage of the 33 teachers were found
to be less accepting, according to their scores on
the SCT.

12. The self evaluations in this study were
not considered to be reliable measures because
of the large number of teachers who declined to
rate themselves.


SECTION VIII 


PREDICTIVE MEASURES —FINDINGS 
AND CONCLUSIONS 


THIS SECTION will present the relationship
between the scores on the SCT and the various
criterion measures. The first part will
present the product-moment correlations for the
standardization group, School I. The second
part will present the correlations between the
criterion measures and the SCT scores after the
investigator had scored the tests ‘‘blind.’’ The
third part will present the correlations of the self
evaluations with the SCT scores. The fourth part
will present the correlations between the SCT
scores and certain biographical data.

The purpose of the SCT was to measure the
subject's attitude of acceptance, a dimension of
the personality organization. Section VI described
the construction, scoring principles, and the
reliability of the predictive measure. The SCT was
divided into two parts, Part I and Part II, on the
basis of scoring technique. It was considered
that the parts were similar in principle and that
both were measuring the same aspect of the
personality organization.

Part I consisted of 27 incomplete sentences
which remained from the original 91 items after
31 had been eliminated as difficult to score, and
the rest had been eliminated by inspection. Part II of
the SCT contained 13 items from the 31 that had
been eliminated as difficult to score on the basis
of the 7-point scale used in scoring Part I. The
items in Part II were scored as positive or negative
responses, and the positive responses were
given a score of 3, equivalent to the mid-point of
the scoring scale used in Part I.

After the investigator had standardized the
scoring system of the SCT on the participating
teachers at School I, he scored the tests ‘‘blind’’
for the teachers at Schools II and III.

The product-moment correlations between the
40 items and the students' Scale A at all
three schools were slightly improved by submitting
each test item to the chi-square test of
significance. Fourteen of the 27 items in Part I
and 12 of the 13 items in Part II met the .20 level
of significance or better when the SCT scores of
the ten highest-rated teachers of the total 104 were
compared with those of the ten lowest-rated teachers.
The student evaluations on Scale A, teacher
effectiveness, were used to select the top- and
bottom-rated teachers. Parts I and II of the SCT
were combined to form Part III. All of the
correlations cited in the rest of this section will be
based upon the SCT scores obtained after the
refinement of Parts I and II as described above.
Part I of the SCT will consist of the scores
obtained on the fourteen items scored from 0 to 6;
Part II will be the scores obtained by multiplying
the number of positive responses by 3, and Part
III will be a combination of the scores for Parts
I and II.
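The scoring rule for the refined SCT can be sketched as follows. Only the arithmetic is taken from the text (fourteen Part I items scored 0 to 6, Part II equal to three times the number of positive responses among twelve items, Part III equal to their sum); the function name, the dictionary keys, and the example responses are hypothetical.

```python
def score_sct(part_one_item_scores, part_two_positive_flags):
    """Combine item-level results into Part I, Part II, and Part III scores."""
    if any(not 0 <= s <= 6 for s in part_one_item_scores):
        raise ValueError("Part I item scores must lie between 0 and 6")
    part_one = sum(part_one_item_scores)
    # Each positive Part II response counts 3, the mid-point of the 0-6 scale.
    part_two = 3 * sum(1 for flag in part_two_positive_flags if flag)
    return {
        "part_one": part_one,
        "part_two": part_two,
        "part_three": part_one + part_two,
    }


# Hypothetical teacher: fourteen Part I scores and twelve Part II responses,
# eight of them judged positive.
example = score_sct(
    [6, 5, 4, 3, 2, 1, 0, 6, 5, 4, 3, 2, 1, 0],
    [True] * 8 + [False] * 4,
)
print(example)
```

Under this rule Part I can range from 0 to 84 and Part II from 0 to 36, so Part I carries more weight in the combined Part III score.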


1. Standardization Group— School I 


The normative data were obtained from the 21
participating teachers at School I. There was no
reason to believe that this sample was in any way
atypical of the larger sample used in this study.
The scoring system for Part I was in large part
devised independently of the ratings, and so it can
be said that the scoring of the items was not biased
unduly by the student ratings on Scale A. Part II
was scored quite deliberately on the basis of the
student ratings for Scale A and therefore it can
be said there was some bias in favor of student
judgment.

Table XIII presents the correlation coefficients
for the normative group at School I. All of the
student evaluations on Scales A and B correlated
significantly with scores on Parts I, II, and III.
Near significant correlation coefficients were
obtained on Scale C. Only one of the administrator
evaluations was significantly correlated with SCT
scores, namely, Part I of the SCT and Scale A.

It will be noted that the student evaluations
correlated better with Part I of the SCT than they
did with Part II. This was also true for the
administrator evaluations on Scale A. However,
Part II improved the correlations on Scales B
and C slightly.
It can be seen that a definite relationship
exists between the teacher's attitude of acceptance
as measured by the SCT and the teacher's
effectiveness as measured on Scale A by the students.
It can be concluded that a tendency exists for the
effective teacher, according to pupil judgment,
to be an accepting person.

The significant correlation coefficients for
Scale B and the SCT scores would also indicate
that the students feel that the accepting person is
also the teacher who seems to trust them, accept
them, and have confidence in them. Apparently
the well-integrated teacher as evidenced by an
attitude of acceptance demonstrates this attitude
in a manner which is felt or perceived by the
students.


TABLE XII

AN ANALYSIS OF THE DIFFERENCES BETWEEN SELF EVALUATIONS
AND OTHER CRITERION MEASURES

School   N   Rater            Mean A    σ     Mean B    σ     Mean C    σ

I       21   Student            6.5    1.0      7.2     .6      7.5     .8
        15   Self               6.2    1.5      7.2    1.5      7.5    1.2
             CR                  .7             0               0
        15   Self               6.2    1.5      7.2    1.5      7.5    1.2
        21   Administrator      5.6    1.5      5.6    1.6      5.9    1.3
             CR                 1.1             3.0*            3.6*

II      51   Student            6.2    1.0      7.1     .8      7.2    1.0
        28   Self               6.1    1.4      7.4    1.3      7.2    1.4
             CR                  .3              .9             0
        28   Self               6.1    1.4      7.4    1.3      7.2    1.4
        51   Administrator      6.0    1.1      6.6    1.2      6.5    1.2
             CR                  .3             2.8*            2.2**

III     32   Student            6.7    1.1      7.8    1.0      8.3     .7
        28   Self               5.7    1.3      8.1    1.1      7.9    1.3
             CR                 3.3*            1.5             1.4
        28   Self               5.7    1.3      8.1    1.1      7.9    1.3
        32   Administrator      5.8    1.1      5.9    1.2      5.9    1.1
             CR                  .3             7.6*            6.2*

*  Significant at .01 level.
** Significant at .05 level.


TABLE XIII

PEARSON PRODUCT-MOMENT CORRELATIONS BETWEEN SCT SCORES
AND CRITERION MEASURES — SCHOOL I*

Evaluations                     Scale A        Scale B          Scale C

Part I (14 items)
  Student Evaluations            .85**      [illegible]**        .43
  Administrator Evaluations      .49**         .15               .18
Part II (12 items)
  Student Evaluations            .51**         .45**             .38
  Administrator Evaluations      .22           .16               .39
Part III (26 items)
  Student Evaluations            .70**         .52**             .41
  Administrator Evaluations   [illegible]      .20               .34

*  N = 21 teachers.
** Significant at .05 level or better.
The correlation coefficients between student
evaluations on Scale C and SCT scores do not indicate
that the accepting person is also the teacher
who teaches his class in an easy manner.
However, there is a definite tendency at School I
for this to be true, as indicated by the near
significant relationship.

The correlation coefficients for the administrators
reflect the same tendency as noted above
in the case of the student evaluations. However,
the tendency for the accepting person to be an
effective teacher was not so marked in the case of
the administrators' ratings as in those of the students.


II. Validation Group—Schools II and III


The SCT was validated on groups of teachers
which did not include any of the subjects used in
developing the scoring principles and scoring
examples. Scoring of the tests was done "blind"
in that the investigator never knew whether the
test he was scoring was supposed to be that of an
effective or ineffective teacher. It was believed that
the procedure at School III was similar to that at
School I. There were enough differences in the
procedure at School II to raise some question
about the reliability of the two validation
groups, and so the correlation coefficients were
computed separately for Parts I and III of the
SCT. Table XIV indicates that significant
correlations were obtained on both samples
for teacher effectiveness, Scale A, as
evaluated by the students. The correlation
coefficients of .454 and .596 obtained on Parts I and
III for Scale A at School II are insignificantly
different from those of .625 and .722 at School III,
considering the difference in number of teachers in
each sample.
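The statement that the School II and School III coefficients do not differ significantly can be illustrated with Fisher's z transformation. This is a sketch only: the paper does not say which test was applied, so the choice of Fisher's z is an assumption, as are the function and variable names. The coefficients and sample sizes (51 and 32 teachers) are those reported in the text and in Table XIV.

```python
import math

def fisher_z_difference(r1: float, n1: int, r2: float, n2: int) -> float:
    """Normal deviate for the difference between two independent Pearson r values."""
    z1, z2 = math.atanh(r1), math.atanh(r2)          # Fisher's z transformation
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))  # standard error of z2 - z1
    return (z2 - z1) / se

N_SCHOOL_II, N_SCHOOL_III = 51, 32

# Student evaluations on Scale A, School II against School III
comparisons = {
    "Part I": (0.454, 0.625),
    "Part III": (0.596, 0.722),
}

deviates = {}
for part, (r_ii, r_iii) in comparisons.items():
    z = fisher_z_difference(r_ii, N_SCHOOL_II, r_iii, N_SCHOOL_III)
    deviates[part] = z
    print(f"{part}: z = {z:.2f} (1.96 is needed at the two-tailed .05 level)")
```

Both deviates come out near 1.0, well short of 1.96, so on this test the two schools' coefficients would not be judged different.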
The significant correlations on Scale B, with the
exception of a near significant correlation for
Part I at School II, would confirm the trend
established by the coefficients at School I. The
correlation coefficients were practically as high for
Schools II and III as they were for School I.

All of the student correlation coefficients on Scale
C were significantly better than would be expected
from chance. A comparison of the results on
Scale C from Tables XIII and XIV shows that
Schools II and III reversed the trend at School I,
the normative group. None of the correlations
at School I was significant, while all were
significant at Schools II and III. However, the
differences were negligible considering the number in
each group.

It will also be noted from Table XIV that the
administrators at Schools II and III were consistent
with those at School I in that they also demonstrated
positive correlations but generally insignificant ones.

In another test for validity of the SCT it was
found that the SCT could successfully distinguish the
effective teacher from the ineffective teacher.
The 104 participating teachers were divided into
two groups on the basis of the average mean rating
of 6.44 for the students' evaluation on Scale
A. At School I it was found that a cutting score
of 63 on Part III of the SCT correctly identified
64 percent of the effective teachers and 86 percent
of the ineffective teachers. At School II the
cutting score identified 69 percent of the effective
and 80 percent of the ineffective teachers;
and at School III, 100 percent of the effective
teachers and 69 percent of the ineffective teachers
were identified.


After combining the three schools, it was
found that the cutting score of 63 identified the
effective and ineffective teachers equally well,
78 percent. At Schools II and III, the validation
group, the cutting score correctly identified 82
percent of the effective teachers and 77 percent
of the ineffective teachers.
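The cutting-score procedure described above can be sketched as a simple two-way classification. The records below are invented for illustration and are not data from the study; the rule that a score at or above the cut counts as a prediction of "effective" is likewise an assumption, since the text gives only the cut (63 on Part III) and the dividing rating (6.44 on Scale A).

```python
def hit_rates(teachers, cut_score=63, rating_split=6.44):
    """Share of effective and of ineffective teachers that the cut identifies correctly.

    teachers: list of (sct_part_iii_score, mean_student_rating_scale_a) pairs.
    """
    effective = [t for t in teachers if t[1] >= rating_split]
    ineffective = [t for t in teachers if t[1] < rating_split]
    effective_hits = sum(1 for sct, _ in effective if sct >= cut_score)
    ineffective_hits = sum(1 for sct, _ in ineffective if sct < cut_score)
    return effective_hits / len(effective), ineffective_hits / len(ineffective)

# Hypothetical records: (SCT Part III score, mean student rating on Scale A)
sample = [
    (70, 7.1), (66, 6.9), (58, 6.6), (64, 6.5),   # rated effective
    (61, 6.0), (55, 5.8), (67, 6.2), (52, 5.1),   # rated ineffective
]

eff_rate, ineff_rate = hit_rates(sample)
print(f"effective teachers identified:   {eff_rate:.0%}")
print(f"ineffective teachers identified: {ineff_rate:.0%}")
```

Reporting the two rates separately, as the text does for each school, keeps a cut that favors one group from being hidden in a single overall figure.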


If the SCT were used as a screening device,
it would, therefore, be able to serve its purposes
adequately. This validity is more significant
when it is remembered that the screening of normals
is considered to be more difficult than differentiating
between normal and abnormal personalities.

According to the rationale of this investigation,
the threatened or insecure person has a
tendency to reject any stimulus which is not readily
incorporated into his frame of reference or
phenomenal field. The evaluations of the self
on the three criterion scales were considered to
be threatening. Other investigators have noted
this phenomenon when the subject is asked to
identify or to describe himself in relation to
some situation.

It was noted in Section VII that the teachers
who rejected the self evaluations were rated lower
by the students and one administrator than
were the teachers who accepted the self evaluations.
The SCT was also able to identify the teachers
who rejected the self evaluations. The difference
in mean scores of 4.72 on the SCT between
the 71 teachers who accepted the self evaluations
and the 33 who rejected the self evaluations
was significant at the .01 level with a critical
ratio of 2.64. The fact that a difference that
large could not be expected by chance one time
out of a hundred would indicate that a significant
variable was operating. It may be safely assumed
that in this instance the attitude of rejection
as measured by the SCT was the variable
responsible for a rejection of the self evaluations.
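A critical ratio of the kind quoted above is a difference between two means divided by the standard error of that difference. The sketch below assumes the large-sample formula for independent groups and a normal approximation for the probability; neither is spelled out in the text. The second example uses the School I self and administrator figures for Scale B as printed in Table XII, on the assumption that the second figure of each pair is a standard deviation.

```python
import math

def critical_ratio(mean1, sd1, n1, mean2, sd2, n2):
    """Difference between two independent means over its standard error."""
    se_diff = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / se_diff

def two_tailed_p(cr):
    """Two-tailed probability of a normal deviate at least as large as |cr|."""
    return math.erfc(abs(cr) / math.sqrt(2.0))

# Reported in the text: a mean difference of 4.72 with a critical ratio of 2.64
reported_cr = 2.64
implied_se = 4.72 / reported_cr
print(f"implied standard error of the difference: {implied_se:.2f}")
print(f"two-tailed p for CR = 2.64: {two_tailed_p(reported_cr):.4f}")

# Table XII, School I, Scale B: self (7.2, 1.5, N = 15) against administrator (5.6, 1.6, N = 21)
cr_scale_b = critical_ratio(7.2, 1.5, 15, 5.6, 1.6, 21)
print(f"recomputed CR, School I self vs. administrator, Scale B: {cr_scale_b:.2f}")
```

The recomputed ratio of roughly 3.1 agrees closely with the 3.0 printed in the table, which suggests that this is the formula behind the reported values.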


TABLE XIV

PEARSON PRODUCT-MOMENT CORRELATIONS BETWEEN SCT SCORES AND CRITERION
MEASURES—SCHOOLS II AND III*

                                 Scale A          Scale B          Scale C
                                 Schools          Schools          Schools
Evaluations                      II      III      II      III      II      III
Part I
  Student Evaluations            .45**   .63**    .28     .50**    .38**   .43**
  Administrator Evaluations      .07     .47**    .23     .26      .10     .18
Part II
(Schools II and III combined)
  Student Evaluations            .2[?]**          [?]              .2[?]**
  Administrator Evaluations      .2[?]**          [?]              .2[?]**
Part III
  Student Evaluations            .60**   .72**    .33**   .63**    .45**   .47**
  Administrator Evaluations      .17     .48**    .26     .33      .26     .34

 * School II, N = 51; School III, N = 32 teachers.
** Significant at .05 level or better.


TABLE XV

PEARSON PRODUCT-MOMENT CORRELATIONS BETWEEN SCT
SCORES AND SELF EVALUATIONS—SCHOOLS I, II, III

Parts        N     Scale A    Scale B    Scale C
Part I       71    .18        .16        .12
Part III     71    .25*       .20        .16

* Significant at .05 level.


III. Self Evaluations of Participating Teachers
for Schools I, II and III


Table XV indicates that a significant correlation
coefficient was obtained between the teachers'
attitude of acceptance as measured by Part
III of the SCT and their own evaluations of their
teaching effectiveness, Scale A. This observation
would mean that there was a tendency for
those teachers who rated themselves high to also
demonstrate a greater degree of acceptance.

The results from Table XV must be interpreted
with caution, as 33 of the 104 participating teachers
did not complete the evaluations of themselves.
The significance of this factor can be
better understood by determining which teachers
declined to evaluate themselves. It was noted
that 23 percent of the teachers, or 13 out of 57,
who had the highest scores on Part I of the SCT
declined, while 42.5 percent, or 20 out of 47, of
those who had the lowest scores declined to evaluate
themselves. This difference of 19.5 percent
would not have occurred by chance one time out
of twenty. If this trend were projected on the
self evaluations, it could conceivably have
improved the correlation coefficient on Scale A.

The fact that a significantly greater number
of the teachers who demonstrated a rejecting
attitude on the SCT also rejected this aspect of
the project would verify the rationale of this
investigation. It will be recalled that one of the
criterion findings noted in Section VII was the anxiety
expressed by the participating teachers when
they were asked to evaluate themselves.
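The comparison of decliners above (13 of 57 high scorers against 20 of 47 low scorers) is a difference between two independent proportions. A minimal sketch of the usual pooled z test follows; the paper does not name the test it used, so the method here is an assumption, and only the four counts come from the text.

```python
import math

def two_proportion_z(x1: int, n1: int, x2: int, n2: int) -> float:
    """Normal deviate for the difference between two independent proportions (pooled SE)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    return (p2 - p1) / se

# Teachers who declined the self evaluation, by standing on Part I of the SCT
declined_high, n_high = 13, 57   # highest-scoring (more accepting) teachers
declined_low, n_low = 20, 47     # lowest-scoring (more rejecting) teachers

z = two_proportion_z(declined_high, n_high, declined_low, n_low)
p_two_tailed = math.erfc(abs(z) / math.sqrt(2.0))

print(f"high scorers declining: {declined_high / n_high:.1%}")
print(f"low scorers declining:  {declined_low / n_low:.1%}")
print(f"z = {z:.2f}, two-tailed p = {p_two_tailed:.3f}")
```

The deviate exceeds 1.96, in line with the statement that the difference would not have occurred by chance one time out of twenty.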
The following conclusions can be drawn from
the analysis of Table XVI and the findings as
described in subdivisions I, II, and III:


1. Even though the validation procedure used
in this study was a crude one, the consistency of
results obtained at Schools II and III, where the
tests were scored "blind," would indicate
that the SCT is able to discriminate between
the effective and ineffective teachers.

2. The relationship between the teacher's
measured attitude of acceptance and his teaching
effectiveness according to the students' judgment
was higher than would be expected by
chance.

3. The mean score of 63 on the SCT, when
used as a cutting score, was found to identify
over 75 percent of the effective and ineffective
teachers.

4. The teacher's attitude of acceptance correlates
with the students' judgment of the
teacher's attitude of confidence and respect
for his students. Likewise, there is a significant
tendency for the accepting teacher to be relaxed
in his teaching, according to the opinion of the
students.


5. The administrators’ judgment of the teach- 
ers’ effectiveness, attitude of confidence and re- 
spect for his students, and the ease with which 
the teachers went about their job of teaching cor- 
related positively with the teachers’ attitude of 
acceptance, but generally insignificantly. 

6. The correlation coefficients at the three
schools were sufficiently consistent to indicate:

a. The scoring system established for the
SCT is a reliable technique.

b. The three samples of secondary school
teachers are not unlike each other and
are probably representative of the total
population of active secondary school
teachers on the personality dimension
of acceptance.

7. Those teachers who were the most accepting
also evaluated themselves higher than the
teachers who demonstrated a rejecting attitude.
The accepting teachers are better able to accept
a threatening situation, such as evaluating themselves,
than are those less accepting teachers.

8. Scale A was consistently the one criterion
measure that correlated best with the measured
attitude of acceptance. Apparently the teacher's
effectiveness is influenced more by an acceptance
attitude than is the teacher's observed attitude of
trust and confidence in the students, or the ease
with which he goes about his teaching.

9. It was indicated in Section VII that the subject
of the stimulus phrase was important in the
construction of sentence completion items. Of
the original 91 items used in the SCT, 51 or 56
percent used the first person, 30 percent were
in third person masculine or feminine gender,
and 14 percent were neuter gender or impersonal
situations. The final 26 items in Part III of
the SCT did not show the same proportion: of the
items that survived the elimination process, 65
percent were first person items, 27 percent third
person, and 8 percent situational. The increased
proportion of first person items in the
final form would indicate that when the item is
structured to elicit a response about the subject,
it is more discriminating. This observation
would tend to reinforce the thesis of this study.


IV. Findings on Biographical Data


This subdivision will attempt to break down
some of the criterion and predictive data and
analyze it with reference to sex, marital status,
dependents, subjects taught, years of teaching
experience, and age. No attempt was made to
discriminate between the teachers at the different
schools. Separate Pearson product-moment
correlation coefficients were computed for each
group to determine the relationship between the
teacher's attitude of acceptance and his teaching
effectiveness.


TABLE XVI

SUMMARY OF CORRELATION COEFFICIENTS BETWEEN SCT SCORES
AND CRITERION MEASURES

Evaluations                      N     Scale A    Scale B    Scale C
Part I
  Student Evaluations
    School I                     21    .85*       .73*       .43
    School II                    51    .45*       .28        .38*
    School III                   32    .63*       .50*       .43**
  Administrator Evaluations
    School I                     21    .49**      .15        .18
    School II                    51    .07        .23        .10
    School III                   32    .47*       .26        .18
  Self Evaluations               71    .18        .16        .12
Part II
  Student Evaluations
    School I                     21    .5[?]      .4[?]**    .38
    Schools II and III           83    .2[?]**    [?]        .2[?]**
  Administrator Evaluations
    School I                     21    .22        .16        .39
    Schools II and III           83    .2[?]*     [?]        .2[?]**
Part III
  Student Evaluations
    School I                     21    .70*       .52**      .41
    School II                    51    .60*       .33**      .45*
    School III                   32    .72*       .63*       .47*
  Administrator Evaluations
    School I                     21    .31        .20        .34
    School II                    51    .17        .26        .26
    School III                   32    .48*       .33        .34
  Self Evaluations               71    .25**      .20        .16

 * Significant at .01 level.
** Significant at .05 level.


1. Sex Differences 


The correlation coefficients between the scores
on the SCT and the student evaluations of their
effectiveness show no significant differences between
the male and female teachers. This would
be expected considering the insignificant differences
between their mean scores on the SCT and
the evaluations, provided the students' Scale
A were a reliable measure of effectiveness (Section
IV showed a reliability coefficient of .88) and
further that the SCT were a reliable predictive
measure (the investigator demonstrated the test's
reliability in Section VI). Therefore, it may be
concluded that the coefficients of correlation noted
in Table XVII are significant ones.

It will be noted that the administrators evaluated
the female teachers higher than the male
teachers. The fact that the administrator evaluations
of the male teachers correlated higher
with the SCT than did the evaluations of the female
teachers may be explained by the difference
in rating, or the unreliability of the administrators'
evaluations, or both. It was noted from an
examination of the correlation charts that a sufficient
number of the teachers whom the administrators
had scored as effective teachers scored
low on the SCT so that the correlation coefficient
of .059 for the female teachers showed no relationship.
This trend was also carried over to
Part III on the SCT.

No significant differences between the male
and female teachers are to be noted on the self
evaluations and the SCT scores. The female
teachers' correlations were higher than the
administrators'. The reverse was true of the male
teachers. This trend can possibly be accounted
for by the fact that the administrator and self
evaluations for the male and female teachers were
reversed.

The same trend noted in subdivision III regarding
the tendency of the rejecting teacher on the
SCT to reject the self evaluations was apparent
in this analysis of sex differences. A higher
percentage of both males and females who scored
on the rejection end of the SCT scale also
rejected the self evaluations. The difference
for the males was not significant; but the difference
in percent, 28.3, for the females was
significant at the .05 level. Apparently the self
evaluation project was more threatening to the
female teachers.

It can be concluded that both sexes are equally
accepting persons as measured on the SCT. The
SCT scores correlated much better with the student
evaluations of teacher effectiveness than they
did with the administrator or self evaluations.
The correlations with administrator ratings were
significantly higher for the male teachers, and
this trend is probably due to a tendency for the
administrator evaluations to be less reliable for
the female teachers. The same factor of unreliability
is evident in the self evaluations, plus
the influence of the 33 teachers who did not evaluate
themselves.


2. Marital Status 


The student evaluations of the participating
teachers show no significant differences between
married and single teachers in the correlations
with the SCT scores on acceptance, as noted in
Table XVIII. The administrator and self evaluations,
however, show significantly higher correlations
with the SCT for the married teachers.
Apparently the administrators were better able
to evaluate the married teacher's effectiveness
as it relates to acceptance than they were to evaluate
the single teacher's effectiveness. In fact,
the single teachers' effectiveness shows no relationship
to acceptance, according to the administrators
and the teachers themselves.

If the students' evaluations show significant
correlations with the teachers' measured attitude
of acceptance for both groups of teachers, and
the administrator and self evaluations for married
teachers are significant or near significant,
it can mean only that the evaluations of the single
teachers are unreliable or differ in some significant
way. The evidence on self evaluations may
also indicate that the single teachers are less
insightful concerning the problem under question
in this investigation. It is also noted in Table
XVIII that the single teachers evaluated themselves
significantly lower than did the married
teachers.

The possible unreality of the single teachers
may be explained by assuming that the tendency
of the single teachers to reject the self evaluation
phase of the project was present also in those
who accepted the self evaluations. None of the
15 single teachers who scored toward the acceptance
side of the SCT failed to complete the self
evaluations. However, 9 of the 16 teachers on
the rejecting side of the SCT also rejected the
self evaluations. Obviously this difference in
percent between 0 and 56 would be a phenomenon
that could not happen by chance short of infinity.
Of the single teachers, 66 2/3 percent were females;
of the married teachers, only 44 percent.
The fact that the female teachers were more
rejecting of the self evaluations may be responsible
for the fact that the single teachers were more
rejecting of the self evaluations.
Not only did the students and the teachers
themselves rate the single teachers lower, but
the single teachers scored significantly lower on
the SCT. The difference in mean scores of 4.27
was significant at the .01 level, indicating that
the single teachers were more rejecting. The
fact that the administrators rated the single teachers
higher would probably account for the negative
correlation of -.120 between Part I of the
SCT and the administrators' evaluations of teaching
effectiveness.


TABLE XVII

SEX DIFFERENCES

Pearson Product-Moment                       Scale A
Correlations                      N     Male       N     Female
Part I of SCT and
  Student Evaluations             60    .54*       44    .63*
  Administrator Evaluations       60    .33*       44    .06
  Self Evaluations                41    .09        31    .09
Part III of SCT and
  Student Evaluations             60    .68*       44    .61*
  Administrator Evaluations       60    .46*       44    [?]
  Self Evaluations                41    .21        31    .24
Mean Scores
  SCT: Part I                     60    46.1       44    45.4
       Part III                   60    64.0       44    65.3
  Evaluations: Student            60    6.5        44    6.4
               Administrator      60    5.7        44    6.2
               Self               41    6.2        31    5.8

* Significant at .05 level or better.


TABLE XVIII

MARITAL STATUS

Pearson Product-Moment                       Scale A
Correlations                      N     Married    N     Single
Part I of SCT and
  Student Evaluations             72    .54*       31    .66*
  Administrator Evaluations       72    .37*       31    -.12
  Self Evaluations                49    [?]        22    -.08
Part III of SCT and
  Student Evaluations                   .65*             .72*
  Administrator Evaluations             .33*             .03
  Self Evaluations                      .34*             .03
Mean Scores
  SCT: Part I                     72    46.6       31    44.1
       Part III                   72    64.6       31    60.3
  Evaluations: Student            72    6.5        31    6.4
               Administrator      72    5.9        31    6.1
               Self               49    6.3        22    5.6

* Significant at .05 level or better.


TABLE XIX

NUMBER OF DEPENDENTS

                                             Scale A
Pearson Product-Moment                  Less than 2        2 or More
Correlations                      N     Dependents    N    Dependents
Part I of SCT and
  Student Evaluations             62    .64*          40   .41*
  Administrator Evaluations       62    [?]           40   .44*
  Self Evaluations                40    .18           30   -.01
Part III of SCT and
  Student Evaluations             62    .60*          40   .45*
  Administrator Evaluations       62    .20           40   .47*
  Self Evaluations                40    .31           30   .16
Mean Scores
  SCT: Part I                     62    45.7          40   46.25
       Part III                   62    63.2          40   63.9
  Evaluations: Student            62    6.4           40   6.6
               Administrator      62    5.9           40   6.0
               Self               40    5.9           30   6.2

* Significant at .05 level or better.


TABLE XX

SUBJECTS TAUGHT

                                             Scale A
Pearson Product-Moment                                     Non-
Correlations                      N     Academic      N    Academic
Part I of SCT and
  Student Evaluations             60    .58*          43   .58*
  Administrator Evaluations       60    .24           43   .18
  Self Evaluations                40    .27           32   -.16
Part III of SCT and
  Student Evaluations             60    .70*          43   .75*
  Administrator Evaluations       60    .28*          43   .34*
  Self Evaluations                40    .40           32   -.15
Mean Scores
  SCT: Part I                     60    46.3          43   45.5
       Part III                   60    64.0          43   62.5
  Evaluations: Student            60    6.5           43   6.4
               Administrator      60    6.0           43   5.8
               Self               40    6.0           32   6.2

* Significant at .05 level or better.


3. Number of Dependents 


Some of the students in the field of personnel 
problems have called attention to the implication 
that persons with dependents and financial obliga- 
tions are more stable and hence better employ- 
ment risks. Table XIX reveals certain data on 
the teacher sample used in this study. 

An examination of the correlation coefficients
in Table XIX would indicate that those teachers
with fewer dependents show a better relationship
between their classroom effectiveness, according
to the students, and their attitude of acceptance
than do those with two or more dependents.
The administrator ratings would indicate the
opposite. Inasmuch as the mean scores on the SCT
show no significant difference between the two
groups, the other variable, evaluation of effectiveness,
is likely to be responsible. The difference
in mean scores, .11, was not significant,
and therefore the administrators' basis of
discrimination between the two groups was apparently
different from that of the students.

It cannot be concluded that either group is more 
accepting than the other, as evidenced by the 
mean scores on the SCT, nor is one group signif- 
icantly more effective. However, the students 
and the teachers themselves show a higher cor- 
relation for those teachers with fewer dependents, 
while the administrators provide significant cor- 
relations for those with more than two dependents. 

A larger percentage of those with fewer de- 
pendents rejected the self evaluations than did 
those with more dependents, 35.5 percent and 
29.5 percent respectively. Of those teachers 
with fewer than two dependents, a significantly 
greater percentage of the rejecting teachers on 
the SCT rejected the evaluations than did the ac- 
cepting teachers. The teachers with two or more 
dependents showed only a chance difference in 
percentage of rejection of the self evaluations. 


4. Subjects Taught 


An inspection of the differences between academic
and non-academic instructors from Table
XX reveals no significant differences between
the two groups on evaluations. However, on the
SCT the academic teachers scored significantly
higher than the non-academic, where the difference
between the mean scores on Part III of 1.54
was significant at the .01 level. This would
clearly indicate that the academic instructors
were more accepting than the non-academic
instructors. This observation on evaluations and
acceptance is contrary to a general opinion that
the students have a tendency to think better of
the non-academic teachers.

The correlations between the two measures
favor the non-academic teachers slightly. The
self evaluations, however, definitely show a higher
correlation for the academic teachers, with a
significant correlation on Part III of the SCT and
a near significant correlation for Part I.

No difference was noted between the "accepting"
and "rejecting" non-academic instructors
with reference to the tendency to reject the self
evaluations. A near significant difference was
noted for the academic instructors, where 25 percent
of the "accepting" teachers did not evaluate
themselves and 42.8 percent of the "rejecting"
teachers did not evaluate themselves.


5. Age of Teachers 


An examination was made of the relationship
between the age of the teachers and the ratings
and between age and scores on the SCT, Part III.
Table XXI indicates a significant difference between
the correlation coefficients of the evaluations
of the teachers and the evaluations by the
teachers, and the teachers' scores on the SCT.

It would seem that the significant and positive
correlation of .245 between age and the teachers'
opinions of their own teaching effectiveness would
show that the older the teachers are the better
teachers they feel they are. This would appear
to be unrealistic according to the negative and
near significant correlation for the students. The
administrators, however, saw no connection
apparently between age and teaching effectiveness.

The negative and significant correlation of
-.256 between age and the SCT score shows that
the older teachers are more rejecting in their
attitude. One would reason that the rejecting
personality demonstrates some personality
disorganization and therefore he is apt to be
unrealistic; if this reasoning is valid, the students are
probably more accurate in their evaluation of
their teachers than the teachers are in their
evaluations of themselves.

A further check on the reliability of the teachers'
evaluation of their teaching effectiveness
was possible from an analysis of the rejection of
the self evaluations. The older teachers tended
to reject the self evaluations more than the younger
teachers did, but the difference of 13 percent
was not significant.

It can be concluded that (1) the older teachers
are more rejecting in their attitudes on the SCT
(-.256); (2) the older teachers felt that they were
more effective teachers (.245); (3) the students
felt that the older teachers were less effective;
(4) there was no significant difference in the
percentage of rejection of the self evaluations between
the older and younger teachers; and (5) the
rejecting person is more apt to be unrealistic.
Therefore, this investigator is inclined to discredit
the significant correlation between the teachers'
concept of their own teaching effectiveness
and their attitude of acceptance.

A test was made with the correlation-ratio
technique for a linear relationship between age
and teaching effectiveness as rated by the students.
It was found that a curvilinear relationship
did exist, but it was not significant; both variables
were significant at less than the .50 level.
The student evaluation variable was only slightly
more valuable than the age variable, .32 and .3
respectively.


TABLE XXI

AGE OF TEACHERS*

Pearson Product-Moment
Correlations                        Scale A
Age and
  Student Evaluations               -.19
  Administrator Evaluations         -.07
  Self Evaluations                  .25**
Age and
  SCT: Part III                     -.26**
Mean Scores
  Age                               38.5 years (Md = 35.8)
  SCT: Part III                     63.4

 * N = 102 teachers.
** Significant at .05 level.


TABLE XXII

YEARS OF TEACHING EXPERIENCE*

Pearson Product-Moment
Correlations                        Scale A
Years of teaching and
  Student Evaluations               -.12
  Administrator Evaluations         .01
  Self Evaluations                  .22**
Age and
  SCT: Part III                     [?]
Mean Scores
  Years of teaching experience      13.6 (Md = 10.5)
  SCT: Part III                     [?]

 * N = 102 teachers.
** Significant at .05 level.
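The correlation-ratio technique mentioned above compares the correlation ratio (eta), which allows the group means to follow any curve, with the Pearson r, which allows only a straight line. The sketch below uses the textbook F test for departure from linearity; whether this is the exact computation the investigator carried out is not stated, so it should be read as an assumption. The age bands and ratings are invented for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def correlation_ratio_sq(groups):
    """Eta squared of y on grouped x: between-group sum of squares over total."""
    all_y = [y for ys in groups.values() for y in ys]
    grand = sum(all_y) / len(all_y)
    ss_total = sum((y - grand) ** 2 for y in all_y)
    ss_between = sum(len(ys) * (sum(ys) / len(ys) - grand) ** 2 for ys in groups.values())
    return ss_between / ss_total

def linearity_f(eta_sq, r_sq, n, k):
    """F for departure from linearity, with k - 2 and n - k degrees of freedom."""
    return ((eta_sq - r_sq) / (k - 2)) / ((1.0 - eta_sq) / (n - k))

# Hypothetical data: student ratings grouped by the midpoint of the teacher's age band
ratings_by_age = {25: [6, 7], 35: [7, 8], 45: [7, 8], 55: [5, 6]}

ages = [age for age, ys in ratings_by_age.items() for _ in ys]
ratings = [y for ys in ratings_by_age.values() for y in ys]

r = pearson_r(ages, ratings)
eta_sq = correlation_ratio_sq(ratings_by_age)
f_value = linearity_f(eta_sq, r * r, n=len(ratings), k=len(ratings_by_age))

print(f"r = {r:.3f}, r squared = {r * r:.3f}")
print(f"eta squared = {eta_sq:.3f}")
print(f"F for departure from linearity = {f_value:.2f}")
```

Eta squared can never fall below r squared, so the size of the gap, judged against its degrees of freedom, is what decides whether a curved relationship is more than a chance departure from a straight line.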


6. Years of Teaching Experience 


It will be noted from Table XXII that the data
on experience are approximately the same as on
age, which could be expected. The same conclusions
are also applicable here. The only significant
factor to be noted with reference to experience
is the fact that the regression line was
more linear than it was for age. The experience
variable was more valuable than the students'
evaluation of effectiveness variable, .24 and .174
respectively.


SECTION IX 
SUMMARY AND CONCLUSIONS 


THIS SECTION will present a summary
of the investigation, conclusions, limitations of
the findings, and implications for further
research.


I. Summary


From the investigator's several years of experience
as a public school teacher, as a supervisor
of Air Force instructors and public school
cadet teachers, as a personal counselor, as one
given to introspection into his own behavior, as
an interested observer of human behavior, and as
a fortunate and grateful student of many leaders
in the fields of education and psychology, the concept
of acceptance has emerged in his thinking as
a meaningful dimension of the optimum personality
organization. It was desired to discover in
this investigation any significant relationship that
might exist between the subject's measured attitude
of acceptance and his effectiveness as a secondary
school teacher. It was further hoped that
the effectiveness of the teacher as measured on
Scale A could be better understood through an analysis
of the relationship between Scale A and the
teacher's acceptance of the student as measured
by the raters on Scale B. The ease with which
the teacher went about his teaching as measured
on Scale C was also thought to be a contributing
factor to teaching efficiency.

The investigator obtained the cooperation of
104 teachers in three secondary schools, and he
obtained ratings on these 104 teachers and 56
additional non-participating teachers from their
administrators and from an average of 62.39 students
per teacher. The criterion evaluations were
correlated against the 104 participating teachers'
measured attitude of acceptance on the SCT.


II. Conclusions from Findings


Some answers to the problems raised in Section I
can be given from an interpretation of the
findings.


1. A relationship far beyond chance expectancy
was found to exist between the teacher's effectiveness
in the classroom as evaluated by the students
and that aspect of the teacher's personality
organization, or attitude, which permits him to
be an accepting person (Tables XIII - XVI). Judging
only from the sample of participating teachers
used in this investigation, it would be possible
to predict with a fair degree of accuracy from
the SCT scores the teacher whom the students
would feel to be the more effective teacher. It
would not be possible to predict safely which teacher
the administrator would judge to be effective.

If the students learn best from the effective
teacher as defined in this investigation, the students
will also learn best from the accepting teacher
as defined and measured by the SCT. The
effective teacher is also the teacher whom the
students feel trusts them and has confidence in
them, and who also seems to teach with ease and
a sense of humor. (Inter-scale correlations
ranged from .66 to .76.)

It was found that the SCT could correctly identify
the effective and ineffective teachers as rated
by the students in better than 75 percent of the
cases.

2. It was found that the SCT was a reliable
measure of the teacher's attitude of acceptance.
Two other scorers agreed with the investigator's
scoring of ten randomly selected tests. The
inter-scorer reliability for the two parts of the
scale ranged from .85 to .95 (Table III). Five
scorers and the investigator were able to
differentiate between an acceptance and a rejection
response in 391 sample responses out of 445, or
87.9 percent. In only 12.1 percent of the responses
did one or more of the scorers disagree with
the others as to whether the response was
demonstrating acceptance or rejection (Table II).

The self evaluations were found to be a significant
measure of the attitude of acceptance. Of
the 33 teachers who declined to evaluate


June, 1953)


themselves, a significant number were found to
demonstrate a rejection attitude on the SCT. The
mean score on the SCT for those who evaluated
themselves was significantly higher than the mean
for those who did not evaluate themselves.

3. It was found that the students' evaluations
of the teachers' effectiveness were a reliable
measure for purposes of this investigation, with
a reliability coefficient of .88. The critical
ratios between the mean scores of the students at
the different schools generally showed no
significant differences (Table VIII).

4. The coefficients of reliability between the
evaluations of the students, administrators, and
participating teachers showed generally a positive
but insignificant relationship (Table IV). The
students and administrators demonstrated a closer
agreement than did the teachers with either
the students or the administrators. Apparently
each class of raters was evaluating the teachers
from a different point of reference. The critical
ratios showing the reliability of the difference
between evaluation scores (Table VII), the
inter-rater reliability (Table IV), and the
correlation coefficients with the SCT (Table XVI) all
indicate the lack of strong agreement between
the different types of raters.

The consistency of each class of raters was
demonstrated in that the rank order of the schools
for the students and administrators was maintained
on all three scales (Table IX). It was also
noted that the students' mean evaluations
of their teachers were highest and the
administrators' lowest, with the teachers' self
evaluations in between (Table VII).

A trend was noted for the students to
rate the participating teachers higher than the
non-participating teachers (Table VII). The
administrators, on the other hand, consistently
reversed this trend. The same trend was noted for
those teachers who declined to evaluate
themselves (Table XII).


The students were, according to paragraphs
... evaluate their teachers' effectiveness in
terms of the acceptance attitude than were the
administrators. This factor may have been
responsible for higher correlations between the
students' evaluations of effectiveness and
acceptance as measured by the SCT than between
the administrators' evaluations and acceptance.

The tenth grade students showed a marked
trend toward higher ratings and the eleventh grade
toward lower ratings (Table X). The male students
rated the teachers the same as did the female
students on Scales A and C; but the boys feel less
accepted and trusted than the girls, as indicated
by a significant critical ratio on Scale B (Table
XI).

None of the above conclusions on the differences
noted between raters can be interpreted to


REED 325 


mean that one class of raters is more able than 
the others. The differences mean only that for 
purposes of this investigation the student evalua- 
tions yielded more meaningful results than the 
other evaluations. 

5. It was not possible to determine the relia- 
bility of the self evaluations, as 33 of the 104 tea- 
chers did not complete all three scales. The self 
evaluations provided meaningful data relative to 
the rationale of the predictive measure, but they
were of no significant value as criterion meas- 
ures, except that there is reason to believe that 
teachers refusing to evaluate themselves are, on 
the whole, less effective teachers. 

6. The biographical data provided some mean- 
ingful relationships with the acceptance attitude 
and with some of the criterion measures. 


The attitude of acceptance seems to be identical
for both male and female teachers (Table
XVII), and for those with less than two depend- 
ents (Table XIX) and more than one dependent. 

A positive and significant trend was noted for the
mean scores on the SCT to be higher for the married
teachers (Table XVII), and for those who
teach academic subjects (Table XXII). The younger
teachers (Table XXI) and those with less than
the mean number of years of teaching experience
(Table XXII) tended to score higher on the SCT,
as evidenced by the negative correlations between
age and acceptance.
An analysis of the mean scores on teacher
effectiveness as evaluated by the different raters
showed some significant trends. The students'
and self evaluations showed no distinction between
the male and female teachers, but the administrators
showed a preference for the male teachers
(Table XVII). No difference was apparent between
the raters in the evaluation of the married
and single teachers, as all three groups rated
the married teachers higher (Table XVII). The
students and administrators agreed that the teachers
with more than one dependent were the better
teachers, but the teachers themselves felt
otherwise (Table XIX). The students and teachers
rated the academic teachers higher than the
non-academic teachers, while the administrators
made no distinction (Table XX).

It may be safely concluded that this investigation
is only one of hundreds of similar attempts
to add some knowledge to our understanding of
the effective teacher. The educational process
cannot be improved through the efforts of educators,
psychologists, and philosophers working
alone. This study may be an insignificant and
intangible contribution for the common good, but
it is hoped that this research effort may be as
helpful to others working in the area of teaching
effectiveness as the many studies examined have
been helpful to this investigator.


MEASURING KNOWLEDGE AND APPLICATION:
AN EXPERIMENTAL INVESTIGATION


DONALD E. SMITH 
MARVIN D. GLOCK 
Cornell University 


The application of knowledge, either for
the solution of an important problem or for the
understanding which enriches life, is the goal of
the student. To what extent classroom learning
can be applied elsewhere, although investigated
sporadically over the past thirty years, has
recently been brought into focus. Does possession
of knowledge imply ability to use that knowledge?
Empirical evidence suggests a negative answer.
But the results of experimental investigations are
inconclusive (8, 3, 10, and 11).

On the assumption that the answer is negative,
that possession of facts and principles does not
assure the ability to use them, and on the further
assumption that current achievement tests do not
adequately consider this outcome of learning,
tests have been devised which purport to measure
it. The authors have found, however, only two
tests with this objective in the literature: the
case study tests in human growth and development
by Horrocks and Troyer (4, 5), and a similar
instrument concerned with the developmental needs
of adolescent girls by Sara Ann Brown (2). The
case study tests seem to be most valuable for
purposes of class discussion. Criticism has been
directed at the method of scoring, which consists
of weightings determined by the amount
of agreement among experts on the correct
response. Information on the second instrument,
that concerned with developmental needs, is
inadequate for an appraisal.


It is suggested that, in addition to 
hievement tests of appli- 


The authors attempted to construct an objective
answer test which measured adequately
the ability to apply facts and principles learned in
an elementary college-level course in general
psychology. They wished to obtain more data relative
to the relationship between knowledge and
application. A final examination adequately covering
the material of a one-term course in general
psychology was constructed in two parts.


Part I, consisting of eighty multiple-choice questions
based upon reading selections, was designed
specifically to measure application of content.
Two of the selections were written by the authors;
three were adapted from materials which were
probably unfamiliar to the students. From ten
to thirty questions immediately follow each selection.
Part II, consisting of seventy-nine questions,
is of traditional design and measures, primarily,
knowledge of facts and principles. There is little
overlapping of specific subject matter although all


course conten 
Care was exercised so that items were not overlapping
and one did not contain information to aid
the testee in answering another.


Part I is composed of five reading selections,
four of which are below the ninth grade level of
reading difficulty, the fifth similar to the textual
material read throughout the term. The average
length is about four hundred words. Three
are largely conversational, picturing situations
familiar to the students, e.g., studying for an
examination in a dormitory room. Two discuss
results of experimental investigations: one is
written for laymen; the other is somewhat more
technical.
Whereas Horrocks and Troyer tested diagnosis
and remediation of hypothetical problems for a
course in human growth and development, the
authors view application of course content for
elementary psychology in a somewhat different
light. When one attempts to use his knowledge
of behavior in a situation, he asks himself, "What
is happening? Why is it happening? What principles
are involved?" Questions illustrative of
this pattern of application were constructed:


What kind of reaction to her lack of
study is indicated by Susan's statement,
"I can't understand a word in that book"?

(1) logical; (2) repressive; (3) projective;
(4) habitual


328 JOURNAL OF EXPERIMENTAL EDUCATION 


(Differentiating ratio: .43)¹


In reference to the following question, the roommate
has remarked that the examination would
be a "lead pipe." Susan mentally added "cinch"
to complete the metaphor:

What is illustrated by Susan's addition
of "cinch" to "lead pipe" resulting in
the expression, "lead pipe cinch"?

(1) rational elaboration of ideas; (2) Muller-Lyer
principle; (3) principle of continuity;
(4) principle of closure

(Differentiating ratio: .16)


Lashley's experiment on retention of habits after
losses of cerebral tissue is the subject of the
following. "Rats were trained first, then operated
upon; then retention was determined by measuring
the amount of practice required to relearn
the maze perfectly." (Excerpt from reading
selection.)

How was retention of a habit determined?
(1) recall; (2) delayed reaction; (3) saving
method; (4) error count

(Differentiating ratio: .28)


It will be apparent from the foregoing examples 
that the student is required to test a limited num- 
ber of solutions to the problem. In a real situa- 
tion, the number of possible solutions is usually 
much greater. But there is required here not 
only an understanding of course material, but also
an ability to recast this material in the form
of hypotheses for the solution of a variety of prob- 
lems. 

Preliminary validation of Part I was determined
by its coverage, the apparent validity of
each item based upon its objective of measuring
application, and the criticisms of the writers'
colleagues. The eighty items were chosen on this
basis from an original group of one hundred and
twenty-five. Part II, a test of knowledge, was
constructed in a like manner. The coefficient of
reliability, based upon a population of one hundred
thirty-three students, was determined by the
split-half method. The Spearman-Brown formula
applied to the half-test coefficient yielded an
estimated reliability of .828.
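
The split-half step can be illustrated with a short sketch. The Spearman-Brown formula for a test of doubled length is r_full = 2 r_half / (1 + r_half). The article reports only the stepped-up value of .828; the half-test coefficient printed by the sketch is back-solved from that figure and is an inference, not a number given by the authors. The function names are hypothetical.

```python
def spearman_brown(r_part: float, factor: float = 2.0) -> float:
    """Estimated reliability of a test lengthened `factor` times,
    given the correlation between comparable parts."""
    return factor * r_part / (1.0 + (factor - 1.0) * r_part)


def implied_half_test_r(r_full: float) -> float:
    """Invert the two-halves case: the half-test correlation that
    would step up to r_full under the Spearman-Brown formula."""
    return r_full / (2.0 - r_full)


# Back-solve the half-test coefficient implied by the reported .828
# (an inference for illustration only; the article does not print it).
r_half = implied_half_test_r(0.828)
print(round(r_half, 3))
print(round(spearman_brown(r_half), 3))
```

The same function with a different `factor` gives the expected reliability of a test lengthened or shortened by that ratio, which is why the lengthening factor is kept as a parameter.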


Results 


The total test was administered as a final
examination at one sitting to a class of one hundred
and sixty-five students from the College of
Agriculture and Home Economics at Cornell University.
The two-hour limit was sufficient for all to
complete the examination.

Validation of the test was approached indirectly
in the absence of a suitable criterion. Product-moment
correlations with various measures of
aptitude and achievement are presented in Table
I. The relationship between term average and
Part I, r = .693, when compared with that for
Part II is rather surprising in light of the objective
of the first part. We might have expected a
somewhat lower coefficient if we are, indeed,
measuring something different from knowledge.
This will be clarified below. On the basis of
coefficients from .10 to .25 between the case study
tests and the Ohio State, Horrocks (4) concludes
that ". . . given a basically good intelligence,
added increments of intelligence in the superior
range do not add to ability to succeed on the test
in question."² It may be questioned whether the
Ohio State measures "intelligence" per se. It
seems to measure reading and vocabulary. It
does seem significant, however, that Part I is
less closely related to this test of academic
aptitude, r = .390, than is Part II, r = .520; and the
same may be said of the Cooperative Science Test
(.388 and .483). This is especially so since Part
I correlates significantly better than does Part II
with both chemistry (.280 and .012) and botany
(.498 and .388) grades. We may conclude from
these results that Part I measures something
necessary for success in chemistry and botany other
than rote learning, vocabulary or reading.³ The
CEEB, V and M score coefficients are not
significant. This may, perhaps, be attributed to a
greater complexity of the factors measured in


1. The differentiating ratio is a simple device for determining the discriminating value of an
item. It is the quotient of the number in the lowest 27% of the population who answer the
item correctly divided by the number in the highest 27% who answer correctly.

2. Yet Troyer (5) states, concerning the self-same test, ". . . Some students do very well in
diagnostic and remedial items from Part I, but as the data become more complex and . . . they
do less well." This is reflected also in the present instrument.

3. Admittedly, this factor of ability to apply, as it is reflected here, is small. But, as will
be seen later, size of a factor in a study of this type does not necessarily indicate its
importance. Observation of the effort and close concentration exerted by the students on Part I
as contrasted with the comparative ease on Part II suggests that ability to generalize and
integrate principles plays an important part in application.
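
The differentiating ratio described in the first footnote can be sketched as follows. This is a minimal illustration, assuming the groups are the lowest and highest 27 percent of examinees on total score (the fraction is kept as a parameter because the printed figure is hard to read). The function name and the sample data are hypothetical.

```python
def differentiating_ratio(total_scores, item_correct, fraction=0.27):
    """Number answering the item correctly in the lowest-scoring group
    divided by the number answering correctly in the highest-scoring group.
    A small ratio means the item separates low from high scorers sharply."""
    if len(total_scores) != len(item_correct):
        raise ValueError("one item response is needed per examinee")
    ranked = sorted(range(len(total_scores)), key=lambda i: total_scores[i])
    k = max(1, round(len(total_scores) * fraction))
    low_correct = sum(bool(item_correct[i]) for i in ranked[:k])
    high_correct = sum(bool(item_correct[i]) for i in ranked[-k:])
    if high_correct == 0:
        raise ValueError("no one in the high group answered correctly")
    return low_correct / high_correct


# Hypothetical class of 100: 6 of the lowest 27 and 24 of the highest 27
# answer the item correctly.
scores = list(range(1, 101))
correct = [False] * 100
for i in range(6):
    correct[i] = True
for i in range(76, 100):
    correct[i] = True
print(differentiating_ratio(scores, correct))
```

Under this reading, the three sample items quoted in the text (.43, .16, .28) would all be answered correctly far more often by the high group than by the low group.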


SMITH - GLOCK 329 


TABLE I

RELATIONSHIPS WITH APTITUDE AND ACHIEVEMENT OF AN ACHIEVEMENT
TEST IN GENERAL PSYCHOLOGY

Measure                                            N    Part I   Part II

Term Average in Course
  (exclusive of final test)                       165   .693*    .650*
Ohio State University Psychological Test           92   .390*    .520*
Cooperative Science Test                           92   .388*    .483
Introductory Chemistry Grades                     113   .280*    .012
Botany Grades                                     101   .498*    .388*
College Entrance Examination Board - V Score       38   .204     .146
College Entrance Examination Board - M Score       38  -.089     .064

*Significant at the .01 level.




this test. Ability in neither the verbal nor the 
mathematical areas is a guarantee of success in 
academic pursuits. 

For further validation of Part I, the following
reasoning was used. Knowledge, as measured
by Part II, does not assure ability to use that
knowledge as measured by Part I. We should
expect, therefore, that those with the highest grades
on Part II might not receive the highest grades on
Part I. Thus, the highest 20% on Part II scored,
at the mean, 4.5 points lower on Part I than did
the top 20% on Part I. The CR of the difference
is 4.1 which is highly significant.

A final method of validating Part I was an analysis
of its correlation with Part II, r = .680 ± .04.
Correction for attenuation yields a coefficient of
.818 ± .03 which seems to indicate that Part I
and Part II are measuring, to a large extent,
though not entirely, the same factors. Horrocks
and Troyer (5) concluded, on the basis of
intercorrelations among the case study tests, that
"application of knowledge to each case tends to
be highly unique." Considering the somewhat
low reliabilities of the case study tests, their
obtained intercorrelations may not reflect the true
relationships. Corrected for attenuation, those
coefficients, .55, .39 and .62, become .72, .50
and .84. Considering the breadth of knowledge
required by differing emphases on the three tests,
these appear to be relatively high coefficients.
Howard (6), in a similar analysis of the relationship
between knowledge and ability to use that
knowledge, obtained a factor loading of .11 which
he assumed was the factor of item complexity,
i.e., ability to apply knowledge. The remaining
factors were identified as content. He concluded
that, within the limits of his study (college level
science), possession of knowledge is a sufficient
guarantee of ability to use it.
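
The correction for attenuation used in the paragraph above divides an observed correlation by the square root of the product of the two reliabilities. The sketch below shows the standard formula and back-solves the reliability product implied by the reported pair of values (.680 observed, .818 corrected). The article does not state the reliabilities that entered the correction, so the printed product is an inference for illustration only, and the function names are hypothetical.

```python
import math


def correct_for_attenuation(r_xy: float, r_xx: float, r_yy: float) -> float:
    """Estimated correlation between true scores, given the observed
    correlation r_xy and the reliabilities r_xx and r_yy of the two tests."""
    return r_xy / math.sqrt(r_xx * r_yy)


def implied_reliability_product(r_xy: float, r_corrected: float) -> float:
    """Product r_xx * r_yy that would turn r_xy into r_corrected."""
    return (r_xy / r_corrected) ** 2


# Reported values for Part I with Part II.
product = implied_reliability_product(0.680, 0.818)
print(round(product, 3))

# Round trip: applying the correction with that product recovers .818.
print(round(correct_for_attenuation(0.680, product, 1.0), 3))
```

Because the correction can only raise a coefficient, the less reliable the two tests, the larger the gap between the observed and corrected values, which is the point made about the case study tests.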

We are now in a position to examine further
the conflicting evidence concerning the relationship
in question. In each of the studies cited,
Horrocks and Troyer, Howard, and the present
investigation, as well as those mentioned in the
introductory remarks, there appears a small factor
which is variously labeled intelligence, higher
mental processes, generalization and abstraction,
inference, ability to apply principles, and
ability to think in a field. It is nearly always a
small factor which tends either to be overlooked
or to be magnified. It is small, presumably,
because the largest single factor in the correlation
studies must be knowledge of course content.
This is the sine qua non of application. One must
have the knowledge in order to use it. The factor
of application is difficult to isolate for this
reason, but it is present in every study. That it
is, in reality, highly important for success in a
field we know empirically.


Conclusions 


Despite conflicting evidence on the relationship
between possession of knowledge and application
of it, tests purporting to measure application
have been constructed. Since we must consider
application of class-learned knowledge a
principal objective of education, there was devised
for a college course in general psychology
an achievement test which was meant to test, on
different parts, application and knowledge. The
parts consisted, respectively, of eighty and
seventy-nine multiple-choice questions, the first
eighty based upon reading selections. Both
reliability and validity appeared satisfactory.

Part I, designed to measure application, correlates
most highly with term averages, chemistry
grades and botany grades. Part II correlates
most highly with general and scientific aptitude
tests. It is concluded that Part I satisfactorily
measures something necessary for success
which is adequately measured neither by aptitude
tests nor by traditional subject matter achievement
tests. How it is related to certain higher
order mental processes has not been determined.

Since possession of knowledge of facts and
principles, per se, is necessary before that knowledge
can be applied, the factor of application,
despite its practical importance, usually appears
small or negligible in correlation and factor
analysis studies. It is suggested that this, in
addition to weaknesses in the studies themselves,
is the reason for the conflicting results of
investigations in the relationship between knowledge
and application.

It should be noted, finally, that application
may not be a spontaneous act. There is ample
evidence that amount of transfer of previous learning
is influenced by amount and kind of training.
Judd, furthermore, has stated that ". . . the
effective use of knowledge is assured not through
the acquisition of any particular item of experience
but only through the establishment of associations
which illuminate and expand an item of
experience so that it has general value."⁴ The
teacher is responsible, then, not only for testing
effective application but also for assisting students
to develop that ability.


4. C. H. Judd, Educational Psychology, quoted in James B. Stroud's Psychology in Education (New
York: Longmans, Green and Co., 1946), p. 592.




BIBLIOGRAPHY 


1. Atkins, Dorothy C. Construction and Analysis
of Achievement Tests (Washington,
D.C.: U. S. Government Printing Office,
1941).

2. Brown, Sara Ann. "Technique for Evaluating
the Ability of Teachers to Apply Principles
Concerned with the Developmental
Needs of Adolescent Girls," Journal of
Educational Psychology, XLI (1950), pp. 481-87.

3. Garrett, Henry E. Statistics in Psychology
and Education (New York: Longmans, Green
and Co., 1941).

4. Horrocks, John E. "The Relationship Between
Knowledge of Human Development
and Ability to Use Such Knowledge," Journal
of Applied Psychology, XXX (1946), pp.
501-508.

5. Horrocks, John E., and Troyer, Maurice E.
"Case Study Tests of Ability to Use Knowledge
of Human Growth and Development,"
Educational and Psychological Measurement,
VII (1947), pp. 23-26.

6. Howard, Frederick T. Complexity of Mental
Processes in Science Testing, Contributions
to Education, No. 879 (New York: Teachers
College, Columbia University, 1943).

7. Stroud, James B. Psychology in Education
(New York: Longmans, Green and Co.,
1946), p. 592.

8. Tilton, J. W. The Relation Between Association
and the Higher Mental Processes,
Contributions to Education, No. 218 (New
York: Teachers College, Columbia University,
1926).

9. Tyler, R. W. Constructing Achievement
Tests (Columbus, Ohio: Bureau of Educational
Research, Ohio State University,
1934).

10. Tyler, R. W. Education as Cultivation of
the Higher Mental Processes (New York:
The Macmillan Co., 1936), Ch. W.

11. Wood, Ben D. Measurement in Higher Education
(Yonkers, New York: World Book
Co., 1923).


AN INVERTED FACTOR ANALYSIS OF
STUDENT-RATED INTRODUCTORY
PSYCHOLOGY INSTRUCTORS


A. W. BENDIG 
University of Pittsburgh 


THE PROBLEM of defining the important
teaching characteristics of teachers has a
long and somewhat fruitless history. Many attempts
have been made to isolate the essential
differences between effective and ineffective teachers,
typically for the future construction of
selection instruments. Autobiographies of generally
recognized great teachers, descriptions
written by students about their best-remembered
teachers, quantitative test records of good and
poor teachers, and a host of other devices have
been used. This has been a particularly important
problem in the selection of elementary and
secondary school teachers as Beecher's recent
review indicates (1). None of the attempts has
proven very fruitful. Schools of education still
depend upon the personal evaluation of prospective
teachers by more or less experienced
interviewers.

Little attention has been devoted to the
characteristics of college teachers in contrast to the
mass of research on elementary and secondary
school teachers. A recent government report (7)
suggests that the main objective in the training
of college teachers is the development of competent
scholars and research workers and that
little attention is paid to the development of
teaching skills.

One method of defining important teaching
characteristics may prove to be the so-called
inverted factor or Q-technique of factor analysis.
A series of quantitative measurements on a group
of teachers may be intercorrelated and the resulting
matrix of correlations factor analyzed to
derive a number of independent factors that can
adequately account for the intercorrelations between
the teachers. Behavioral descriptions of these
factors may then help us in devising independent
predictive measures of these important teaching
variables.

The present study was concerned with the
application of inverted factor analysis techniques
to a group of student ratings of introductory
psychology instructors. It has been previously
shown that student ratings can reflect individual
differences between instructors (3). Whether
from these same ratings significant and independent
constellations of teaching behaviors can be
determined was investigated in this research.


Procedure 


Ten introductory psychology instructors were 
rated by their undergraduate students at the end 
of the fall semester, 1950-51. Each student rat- 
ed his instructor on the fourteen five-choice rat- 
ing scales developed and described by Crannell (4). 
The scales cover many different facets of instruc- 
tor personality, such as organization of course 
material, friendliness toward the students, per- 
sonal appearance, etc. A total of 490 students 
participated in the ratings with individual instruc- 
tors being rated by from 12 to 90 students. Fur- 
ther information on the instructors, students, and 
scales can be found in a previous report (3). This
previous paper also contains the obtained means 
and standard deviations of each of the fourteen 
scales for 490 daytime students in addition to 
similar information for evening sections of intro- 
ductory psychology. 

The mean rating of each of the ten instructors 
on each of the fourteen scales was computed and
the raw score deviation of the mean from the 
mean of the scale determined. This deviation 
was given a plus sign if the deviation was toward 
the low (favorable) end of the scale and a negative 
sign if toward the high (unfavorable) end. Since 
individual scales differed in their variability, the 
fourteen deviation scores for each instructor were 
divided by the standard deviations of the scales. 
The basic data on each instructor was a profile 
of fourteen standard scores indicating his posi- 
tion on each scale as being above or below the 
mean of the group and the amount of his deviation 
in standard score units. 

The profile scores of each instructor were 
then correlated (product-moment) with the pro- 
files of each of the other instructors and the correlational
matrix shown in Table I determined.
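
The two steps just described (standard-score profiles, then product-moment correlations between instructors rather than between scales) can be sketched as follows. The ratings are randomly generated stand-ins, since the raw means are not reproduced here; the function names are hypothetical, and taking each scale mean as the average of the ten instructor means is an assumption of the sketch, not a statement of how the original scale means were obtained.

```python
import numpy as np


def standard_score_profiles(instructor_means, scale_means, scale_sds):
    """Rows are instructors, columns are scales. The deviation is positive
    when the instructor's mean lies toward the low (favorable) end."""
    instructor_means = np.asarray(instructor_means, dtype=float)
    return (np.asarray(scale_means) - instructor_means) / np.asarray(scale_sds)


def profile_correlations(profiles):
    """Inverted (Q-technique) correlations: each instructor's profile across
    the scales is correlated with every other instructor's profile."""
    return np.corrcoef(profiles)  # rows are treated as the variables


# Hypothetical data: 10 instructors rated on 14 five-choice scales.
rng = np.random.default_rng(0)
instructor_means = rng.uniform(1.0, 5.0, size=(10, 14))
scale_means = instructor_means.mean(axis=0)   # assumption, see lead-in
scale_sds = rng.uniform(0.5, 1.5, size=14)    # assumed scale standard deviations

profiles = standard_score_profiles(instructor_means, scale_means, scale_sds)
R = profile_correlations(profiles)
print(R.shape)
```

The result is a 10 by 10 matrix of the same form as Table I, with one row and column per instructor.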

This matrix of intercorrelations between in- 
structors was analyzed by standard Thurstone 
centroid techniques with the slight variation that 
in reflecting signs in the original and residual 
matrices the criterion used was to maximize the 




algebraic sums of the columns of correlations 
rather than to minimize the number of negative 
signs. This variation has been used by Michael, 
Zimmerman, and Guilford (9). On the first an- 
alysis of the matrix the highest correlation in 
each column was used as the communality esti- 
mate in the diagonal of the table. This practice 
was also followed for each of the residual ma- 
trices as suggested by Thomson (10). Analysis 
was discontinued after the extraction of the third 
factor upon the application of Tucker's criterion 
(11). New communality estimates were made on 
the basis of the three extracted factors, these 
new estimates inserted in the original matrix, 
and the matrix again analyzed. The process was 
repeated a third time and the median absolute 
difference between communality estimates based 
on the second and third analyses found to be .025.
On the basis of this small variation iteration of 
the procedure was stopped. 
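
The extraction of a single centroid factor, with signs reflected so as to increase the algebraic column sums, can be sketched roughly as below. This is a simplified illustration under stated assumptions, not a reconstruction of the exact computing routine used in the study: the reflection rule is a greedy one (reflect the variable whose off-diagonal column sum is most negative until none is negative), the diagonal is assumed to hold communality estimates already, and the function name and the small test matrix are hypothetical.

```python
import numpy as np


def centroid_factor(R):
    """Extract one centroid factor from a symmetric correlation matrix R
    whose diagonal already holds communality estimates.
    Returns (loadings, residual matrix)."""
    R = np.asarray(R, dtype=float)
    signs = np.ones(R.shape[0])
    while True:
        reflected = R * np.outer(signs, signs)
        off_diag_sums = reflected.sum(axis=0) - np.diag(reflected)
        worst = int(np.argmin(off_diag_sums))
        if off_diag_sums[worst] >= -1e-12:
            break
        # Reflecting this variable raises the grand total of the matrix.
        signs[worst] *= -1.0
    col_sums = reflected.sum(axis=0)
    loadings = signs * col_sums / np.sqrt(col_sums.sum())
    residual = R - np.outer(loadings, loadings)
    return loadings, residual


# Hypothetical rank-one matrix built from loadings (.8, .6, -.5).
true_loadings = np.array([0.8, 0.6, -0.5])
R_demo = np.outer(true_loadings, true_loadings)
loadings, residual = centroid_factor(R_demo)
print(np.round(loadings, 2))
print(np.round(np.abs(residual).max(), 6))
```

Further factors would be taken by inserting new communality estimates in the diagonal of the residual matrix and repeating the call, stopping when a criterion such as Tucker's indicates that nothing systematic remains.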

The problem of rotating the factor axes to a 
**psychologically meaningful" position presented 
problems. Pairs of the three extracted factors 
were graphed as the distribution of the ten instruc- 
tors in two-dimensional factor space. The cri- 
terion adopted was to rotate the axes so as to 
maximize the separation of the instructors into 
two or three groups on each factor. The first 
factor extracted adequately met this criterion, 
the second and third were rotated to meet this 
requirement. The original factor loadings for 
all three factors and the rotated values for Fac- 
tors II and III can be found in Table II. 

The problem of naming or describing the ob- 
tained factors in a Q-technique study presents 
some difficulties for which several solutions have
been used. Guilford and Holley (5) computed the 
product of the factor loading of each individual by 
the rating given by that individual to each of the 
test objects. The sums of the products for each 
object determined the rank order of the objects 
for a given factor and from this rank order the 
authors derive a verbal description of the factor. 
Bendig (2) used the factor loadings of the individ- 
uals on each factor to graph the individuals on a 
linear scale and asked judges who knew the sub- 
jects very well to write a description of a person- 
ality characteristic that would result in the obtained
ordering of the subjects. Factor descrip-
tions were abstracted on the basis of common 
phrases in the written descriptions. The method 
used in the present study is similar to that used 
by Holley and Buxton (6). In their study factor 
loadings were correlated by biserial correlation 
with the responses of the subjects to each true- 
false item on a test of beliefs. Items correlating 
highest with the factor loadings were used to de- 
scribe the factors. 

In the present study the factor loadings of the 
instructors on each factor were separately cor- 
related (product-moment) with their standard 




Scores on each of the fourteen rating scales. A 
positive correlation indicated that instructors 
with high positive loadings on a particular factor 
tended to get favorable ratings on the single scale 
involved and negative correlations the reverse. 
The three rating scales showing the largest abso- 
lute correlations (regardless of the algebraic 
sign) were used to derive a description of each
factor. The wording of the factor descriptions 
was taken from the wording of the correlated 
Scales and no abstraction of verbal content was 
attempted. The following are the descriptions of 
the characteristics of instructors on the positive 
and on the negative extremes of each factor. The 
scales and their correlations with the factor load- 
ings are given after each factor heading. 


Factor I. Scales 2, 6, 10 (-.69, -.66, -. 79) 

Content of his classroom presentation is some" 
times dull and uninteresting. Usually keeps fair 
control of the class, but sometimes lets students 
sidetrack him. Usually shows some sense of hum^ 
or in class. 

Content of his classroom presentation is fre- 
quently quite interesting and seldom is dull. Al- 
ways keeps things moving smoothly and seldom 
loses control of the class. Has an exceptionally 
good sense of humor. 


Factor II. Scales 1, 4, 7 (.37, -.36, -.38)

Course material is very well organized. Has
a fairly friendly attitude toward students, but
sometimes is variable in his attitude. Usually
is reasonable as to length of assignments, but
sometimes is unreasonable.

Part of the course material is organized, but
most of it is loosely organized and becomes in-
definite and confusing. Has an exceptionally
friendly attitude toward students. Always very
fair and reasonable toward the length of assign-
ments.


Factor III. Scales 5, 8, 12 (-.44, .45, -.57)

Occasionally recognizes student effort, but
sometimes appears indifferent. His examina-
tions are usually quite fair and reasonable. Shows
few annoying mannerisms in class, but is not un-
usually attractive in appearance.

Exceptionally appreciative attitude toward stud-
ent effort and encourages it. His examinations
are sometimes unfair. Has an unusually attrac-
tive appearance.


To validate the above factor descriptions four
members of the departmental faculty who know
the instructors quite well were given the pairs
of factor descriptions and asked to rank order
the ten instructors along each scale from the in-
structor best described by one of the pair of par-
agraphs to the instructor best described by the
other paragraph. Their rankings of the instruc-



TABLE I


CORRELATIONS (PRODUCT-MOMENT) BETWEEN STANDARD SCORE PROFILES OF TEN
INSTRUCTORS ON STUDENT RATING SCALES


Instructor     B     C     D     E     F     G     H     I     J
    A        -.26  -.08   .59  -.37   .22  -.16   .09  -.29  -.39
    B              -.44  -.38   .31  -.57   .40  -.46   .44   .53
    C                     .55  -.49   .65  -.45   .08  -.33  -.45
    D                          -.33   .18  -.27   .11   .05  -.41
    E                                -.54   .25  -.07   .49   .28
    F                                      -.60  -.06  -.96  -.46
    G                                            -.36   .41   .21
    H                                                  -.37  -.91
    I                                                         .03

TABLE II


FACTOR LOADINGS OF TEN INSTRUCTORS DETERMINED FROM
CORRELATIONS BETWEEN STUDENT RATING SCALE PROFILES


Original Rotated 
Instructor I II IH π n h? 
-.51 «30 .68 .81 


.T3 .18 .04 .57 
.6T .23 -.56 .82 
.56 .41 .17 BÀ 


.16 ο --42 5 


.62 .33 .16 .52 


Percent of total variance    37   11   10   37   10   11   58




tors on each factor were correlated (rank-differ- 
ence rho) with the rankings of the instructors on 
each factor as determined by their loadings. The 
median validity coefficient for each scale was com- 
puted and the inter-judge agreement among the 
four judges was determined by Kendall's "coef-
ficient of concordance" (8, p. 80ff). The med-
ian validity coefficient and the coefficient of con-
cordance for each factor were: Factor I, .50 and
.77; Factor II, .49 and .56; and Factor III, .41
and .53.
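
The two statistics used in this validation can be sketched as follows, assuming untied ranks: the rank-difference rho between a judge's ranking and the ranking implied by the loadings, and Kendall's coefficient of concordance W among the judges. The loadings and the four judges' rankings below are invented for illustration (five instructors rather than ten) and are not the study's data.

```python
from statistics import median


def ranks(values):
    """Rank values from 1 (largest) to n; assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    r = [0] * len(values)
    for position, i in enumerate(order, start=1):
        r[i] = position
    return r


def spearman_rho(rank_a, rank_b):
    """Rank-difference correlation for two untied rankings of the same objects."""
    n = len(rank_a)
    d2 = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))


def kendall_w(rankings):
    """Kendall's coefficient of concordance for m judges ranking n objects (no ties)."""
    m, n = len(rankings), len(rankings[0])
    totals = [sum(judge[i] for judge in rankings) for i in range(n)]
    mean_total = m * (n + 1) / 2
    s = sum((t - mean_total) ** 2 for t in totals)
    return 12 * s / (m ** 2 * (n ** 3 - n))


# Hypothetical loadings on one factor and hypothetical rankings by four judges.
loading_ranks = ranks([0.7, 0.3, 0.1, -0.2, -0.6])
judges = [
    [1, 2, 3, 4, 5],
    [2, 1, 3, 4, 5],
    [1, 3, 2, 5, 4],
    [2, 1, 4, 3, 5],
]
validities = [spearman_rho(loading_ranks, j) for j in judges]
median_validity = median(validities)
concordance = kendall_w(judges)
```

W runs from 0 (no agreement among judges) to 1 (identical rankings), so it cannot be negative, whereas each judge's rho can range from -1 to 1.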


Discussion 


The pairs of factor descriptions of introduc-
tory psychology instructors reported above would
seem to be the important point of this study. The
pictures of important (to the student) constella-
tions of behavioral characteristics of instructors
are not those of the conventional "good teacher,"
but they sound suspiciously like descriptions of
human beings. Seemingly mutually contradictory
characteristics are combined in the descriptions,
yet an image of teachers we all have known be-
gins to appear from the synthesis of these traits
within a single factor description. Nor do the de-
scriptions lack validity when objectively compared
to the evaluations of judges who are well acquaint-
ed with the personal characteristics of the in-
structors involved. No attempt can be made to
evaluate which ends of the factor continua are
"good" or "bad" traits of a teacher; that depends
upon the rational judgment of a judge or upon the
empirical relationship between these continua and
some outside criterion of teaching effectiveness,
such as the amount of course content that is ab-
sorbed by the students. The reliability or stabil-
ity of the factors from course to course, from
discipline to discipline, or from school to school
is, of course, unknown. Nor is it known whether
similar factor descriptions would be found if a
different initial set of rating scales were used.


Summary 


The student rating scale profiles of ten intro- 
ductory psychology instructors were correlated 
and the matrix of intercorrelations was factor 
analyzed by inverted factor techniques. Three 
factors were extracted, two of which were rotat- 
ed to maximize clustering of instructors into 
groups. The factor loadings of the instructors 
were correlated with their scores on each of the 
fourteen rating scales, and the three scales cor-
relating highest with each factor were used to describe
the extremes of each linear factor. Validity of
the factor descriptions was determined by correl- 
ating the factor loadings of the instructors with 
the rankings of the instructors on the three fac- 
tors by four competent judges. The median val- 
idity was .49. 


BIBLIOGRAPHY 


1. Beecher, D. E. The Evaluation of Teaching:
   Backgrounds and Concepts (Syracuse, N.Y.:
   Syracuse University Press, 1949).

2. Bendig, A. W. "A Q-Technique Study of the
   Professional Interests of Psychologists."
   (To be published.)

3. Bendig, A. W. "The Use of Student Rating
   Scales in the Evaluation of Instructors in
   Introductory Psychology," Journal of Ed-
   ucational Psychology. (In press.)

4. Crannell, C. W. "An Experiment in the Rat-
   ing of Instructors by Their Students," Col-
   lege and University, XXIII (1948), pp. 5-
   11.

5. Guilford, J. P. and Holley, J. W. "A Fac-
   torial Approach to the Analysis of Variances
   in Esthetic Judgments," Journal of Experi-
   mental Psychology, XXXIX (1949), pp. 208-
   218.

6. Holley, J. W. and Buxton, C. E. "A Factor-
   ial Study of Beliefs," Educational and Psy-
   chological Measurement, X (1950), pp. 400-
   410.

7. Kelley, F. J. Toward Better College Teach-
   ing (Washington, D.C.: U. S. Government
   Printing Office, 1950).

8. Kendall, M. G. Rank Correlation Methods
   (London: Griffin, 1948).

9. Michael, W. B., Zimmerman, W. S. and
   Guilford, J. P. "An Investigation of Two
   Hypotheses Regarding the Nature of the
   Spatial-Relations and Visualization Fac-
   tors," Educational and Psychological Meas-
   urement, X (1950), pp. 187-213.

10. Thomson, G. The Factorial Analysis of Hum-
    an Ability, 3rd ed., (New York: Houghton
    Mifflin, 1948).

11. Wright, R. E. "A Factor Analysis of the Or-
    iginal Stanford-Binet Scale," Psychomet-
    rika, IV (1939), pp. 209-220.


JUDGMENTS BY 820 COLLEGE EXECUTIVES 
OF TRAITS DESIRABLE IN LOWER- 
DIVISION COLLEGE TEACHERS 


M. R. TRABUE 
Pennsylvania State College 
State College, Pennsylvania 


THE AMERICAN Association of Colleges for Teacher Education
appointed in February, [...], a subcommittee of its standing
Committee on Studies and Standards to make a study of "the
preparation of college teachers".1 This subcommittee, after a
number of meetings, decided that its chief effort would be to try
to clarify the directions in which changes should be made in the
preparation of college teachers, and that the most available
source of data on current qualifications would be the judgments
of college executives who regularly employ newly-prepared college
teachers. Since most young teachers begin their careers on the
faculty by teaching introductory courses to first and second year
students, it seemed appropriate to focus the inquiry on the
traits possessed by these lower-division college teachers.

A [...] review of previous studies and published [...] provided a
long list of traits and behavior patterns that students, fellow
faculty members, alumni, and others had considered important in
college teachers. By eliminating items that had little real
evidence behind them, combining and rephrasing items that were
sufficiently similar, and grouping together the items that seemed
to refer to related types of activities, a list of fifty-two
traits was obtained and printed in Inquiry A. College executives
were asked to read each statement carefully and to indicate, by
making a check mark in one of three printed columns, the amount
of weight usually given to that trait when employing a new
instructor or assistant professor. They were also asked to make a
check in a fourth column opposite any desirable trait "of which
you have rarely found evidences on the credentials of applicants
for teaching positions in your institution."

A number of special reports have been published, but in each of
them the data reported have been upon those particular traits
which were rated "highly important" by 50 percent or more of the
special groups of college executives to which the report was
addressed.2 In order to provide a complete record, so that other
investigators may be able to use the data intelligently, it seems
desirable to publish now the original form (Inquiry A) used,
including not only the printed instructions and the fifty-two
traits, but also the frequency with which the 820 college
executives checked each of the four columns opposite each trait.
In this tabulation, for example, only 10 of the 820 executives
reported that they considered item Ia ("General academic record
is high") as having "Little Value" (undesirable or not very
important), while 465 of them considered it as having "Real
Value" (important), 345 checked it as having "Great Value"
(highly important), and only 12 of them reported that this item
was "Rarely Noted" in the credentials of applicants for teaching
positions in their institutions.

The most important findings that do not appear in the data
reported here are differences among


1. The members of the Committee are Dr. Ruth E. Eckert,
University of Minnesota; Dr. Karl Bigelow, Teachers College,
Columbia University; Dean L. D. Haskew, University of Texas;
President John R. Emens, Ball State Teachers College; and
President S. M. Brownell, New Haven State Teachers College.

2. "Characteristics of Lower Division College Teachers Preferred
by Executives of Teacher Education Institutions," Third Yearbook,
American Association of Colleges for Teacher Education; [...],
N. Y., 1950. Pp. 67-14
"Characteristics of College Instructors Desired by Liberal Arts
College Presidents," Bulletin, Association of American Colleges,
XXXVI, No. 2 (October 1950), pp. 211-212.
"What Traits Should Junior College Teachers Possess?" Junior
College Journal, XXI, No. 3 (November 1950), pp. 110-12.



Responses by 820 College Executives 


PREPARATION OF COLLEGE TEACHERS 


Inquiry A 


This inquiry is to be checked by the chief employing officer (President or Dean) of the college. 


INSTRUCTIONS 


In checking the traits and experiences listed below as "Qualifications of College Teachers,"
please indicate your judgment of the practical [...] of an applicant for a position as Instructor
or Assistant Professor to teach lower-division classes?

1. Make a check mark (V) in the first column (Little Value) if you consider the item either as
"undesirable" or as "not very important,"

2. Check the item in the second column (Real Value) if you consider it "important,"

3. Check the item in the third column (Great Value) if you consider it "highly important."

In the fourth column (Rarely Noted), make a second check mark opposite any item you have al-
ready checked as having "Real Value" or "Great Value", but of which you have rarely found evi-
dences in the credentials of applicants for teaching positions in your institution.

Added spaces are provided at the end of each section in which to list and to rate additional qual-
ifications for which you always look when employing teachers for lower-division college classes.


                                                             Relative Importance

                                                            1       2       3
Qualifications of College Teachers                       Little    Real   Great   Rarely
                                                          Value   Value   Value   Noted
I. As a Scholar
   a. General academic record is high                       10     465     345      12
   b. Academic record in his special field is unusually
      high                                                  26     380     414      11
   c. Has done important research in his field             251     460     109      62
   d. Has published scholarly articles or books            273     461      86      55
   e. Contributes to meetings of professional and
      scholarly societies                                  118     577     125      86
   f. Has earned doctor's degree                           119     454     247      12
   g. Holds a graduate degree from a "noted univer-
      sity"                                                317     405      98      13
   h. Graduate major was in a special area of an aca-
      demic subject (e.g., Modern European History;
      Colloidal Chemistry; etc.)                           496     258      66      31
   i. Graduate major covered all important divisions
      of his academic subject (e.g., History; Physics;
      Psychology; etc.)                                     44     447     329      59


   j. Graduate study included all divisions of his sub-
      ject plus extensive work in another broad field

II. As a Teacher
   a. Understands the problems most often met by col-
      lege students in their work
   b. Has studied problems of college teaching and of
      its evaluation
   c. Has successfully taught his subject in college
   d. Has studied the objectives of "general education"
      for college students
   e. Has successfully taught college courses for their
      "general education" values
   f. Has been successful as elementary or secondary
      school teacher
   g. Organizes materials and prepares carefully for
      each meeting with class
   h. Inspires students to think for themselves and to
      express their own ideas sincerely
   i. Leads students to take responsibility for planning
      and checking their own progress
   j. Has demonstrated skill in methods of instruction
      appropriate to his field

III. As a Student Counselor
   a. Is friendly, democratic, tolerant, and helpful in
      his relations with students
   b. Assists students to collect, analyze, and evaluate
      data on their own vital problems
   c. His students voluntarily seek his advice on intim-
      ate personal problems
   d. Has studied the techniques of diagnosis and guid-
      ance of college students
   e. Has demonstrated unusual competence as a
      counselor of college students
   f. Has been successful as leader of young people
      in scouting, club work, camping, etc.

IV. As a College Faculty Member
   a. Has studied the special interests, abilities, and
      needs of college students
   b. Has studied the purposes, curricula, organiza-
      tion, and procedures of higher education
   c. Has participated constructively in departmental
      and general faculty meetings
   d. Contributes effectively in committee planning and
      work
   e. Takes broad (rather than departmental) view of
      educational problems                                   4

1 Little Value:  88  11  47  108

2 Real Value:  391  166  444  354  489  463  373  169  58  243  281  211
               149  373  398  522  430  506  435  478  488  496  245

3 Great Value:  341  654  297  439  259  238  158  648  761  566  530  562
                670  423  330  190  330  82  352  281  292  299  571  172
                154  83  229  247  169  151  279

                                                            1       2       3
                                                         Little    Real   Great   Rarely
                                                          Value   Value   Value   Noted
   f. Understands the contributions of college instruc-
      tion in other fields than his own                     15     459     346     258
g. Regards himself as primarily a college teacher 39 

(rather than as a subject-matter specialist) 31 251 538 2 
h. Shows active interest in continued professional 63 

study 10 303 507 

V. As a Person 28 
a. Has good health and physical vigor 4 382 434 0 
b. Is emotionally stable and mature 1 106 713 1 
c. Has genial personality and sense of humor 4 329 487 70 
d. Is always neat and well groomed 36 551 233 45 
e. Has a wholesome family life 13 402 405 116 
f. Is less than 35 years of age 586 214 20 30 
g. His behavior reflects high ideals 2 213 605 67 
VI. As a Citizen

a. Is at ease in social situations 34 587 199 121 
b. Is well informed on current events 32 589 199 163 
c. Participates in cultural activities (art, music, 

literature, etc.) 92 626 102 81 
d. Takes a part in religious activities 135 468 217 TI 
€. Is active in civic and welfare groups 118 620 82 7" 
f. Engages actively in political work 630 181 9 135 
g. Holds fair-minded attitudes on controversial 

issues 17 404 399 167 
h. Is an effective public speaker 118 564 138 108 
i. Was successful in a non-academic job 390 375 55 108 
j. Has gained a cosmopolitan outlook through 
7 travel and wide reading 62 593 165 (8 


ich your ratings (above) are probably most valid. 
Teachers college...... School of Education... . |. Community or junior college...... Liberal 
Arts College... .., Other...... 


Any comments re 
helpful. Please indi 


ο μωμ Address 


Mail when completed to M. R. Trabue, 102 Burrows Building, State College, Pa. 


the executives of different types of colleges and
differences among those whose institutions are
located in different sections of the United States.
The size of these differences on most items was
surprisingly small.3 In the table published on
page 136 of the June, 1951, issue of the Journal
of Teacher Education, for example, the mean
difference between the extreme percentages of
"highly important" ratings on the same items
given by executives of junior colleges, col-
leges for teacher education, and liberal arts col-
leges was 12.3 percent. The mean difference be-
tween the extreme percentages by executives of
colleges in the four different sections of the
country was only 7.3 percent. While these dif-
ferences reflect somewhat different philosophies
of higher education among college executives in
different parts of the country and in different
types of institutions, the important facts are the
differences among the traits themselves, as re-
ported here in the combined ratings of 419 liberal
arts college presidents, 204 junior college pres-
idents, and 197 executives of colleges for teach-
er education.


3. The writer will be glad to supply the basic data on these differences to any research worker
who has real need for them.
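
The arithmetic behind statements such as "rated highly important by 50 percent or more" and "mean difference between the extreme percentages" can be sketched as follows. The counts are those tabulated above for items a to f of Section I; the helper names and the two-item, three-group percentages passed to the second function are invented for illustration only.

```python
TOTAL = 820  # number of college executives responding

# (Little, Real, Great, Rarely Noted) counts for Section I, items a-f, as tabulated above.
counts = {
    "Ia": (10, 465, 345, 12),
    "Ib": (26, 380, 414, 11),
    "Ic": (251, 460, 109, 62),
    "Id": (273, 461, 86, 55),
    "Ie": (118, 577, 125, 86),
    "If": (119, 454, 247, 12),
}


def pct(n, total=TOTAL):
    """Percentage of respondents, rounded to one decimal place."""
    return round(100 * n / total, 1)


# Percent of executives checking "Great Value" (highly important) for each item.
highly_important = {item: pct(c[2]) for item, c in counts.items()}
majority_items = [item for item, p in highly_important.items() if p >= 50]


def mean_extreme_difference(pcts_by_item):
    """Average, over items, of the gap between the highest and lowest group percentage."""
    gaps = [max(v) - min(v) for v in pcts_by_item.values()]
    return sum(gaps) / len(gaps)


# Hypothetical percentages for two items across three types of institution.
example_gap = mean_extreme_difference({"x": [40, 52, 47], "y": [10, 12, 20]})
```

Because the first three columns are mutually exclusive, each row's first three counts should sum to 820; the fourth column is a second check mark and is not part of that total.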


» Journal of Experimental Education 


Volume XXII 
September, 1953 Number 1 


PERFORMANCE IN A VERBAL ADDITION
TASK RELATED TO PRE-EXPERIMENT-
AL "SET"* AND VERBAL "NOISE"


E. VICTOR MECH**
Indiana University


Introduction

THE EMPHASIS which is given to noise [...] campaigns implies a
common assumption that noise is a hindrance to effective work.
This interest in controlling noise has, however, [...] the
empirical evidence available. In view of this widespread
interest, we might reasonably expect to find a considerable body
of well grounded scientific literature relevant to the topic of
noise, and how it affects the behavior of man. Unfortunately,
there is a lack of evidence [...] the problem at the human level
in terms which provide a basis for generalization.


Scope of the Investigation

The [...] was concerned with [...] the hypothesis that a set of
formal instructions concerning [...] matter of the experiment can
influence the results obtained, when these results are defined as
the performance of an individual in a specific task. The problem
can be stated more precisely in the following manner. Can it be
demonstrated experimentally that systematic performance changes
are possible in the presence of noise and quiet conditions, de-
pending upon whether the subject expected the noise to be an aid
or hindrance to subsequent performance? In its broader aspects
the specific problem bears a relationship to experiments in
which human subjects were used who were capable of imposing self
instructions or of responding to minimal cues. This would include
those experiments purporting to demonstrate that one educational
method is better than another, or that one method of school
learning is more efficient than another.


Design

Four groups of subjects were used, with 15 subjects assigned to
each group, a total of 60 S's being used in the experiment. Each
group worked eight days under 2 conditions each day, one a noise
condition, the other a quiet condition. Each condition on each
day consisted of fifteen 60-second work periods. Thus each
individual subject served a total of four work hours in the ex-
periment, over the eight day period.

The experimental instructions for the four groups were as
follows:

Group A. The S's in this group were told only that the
experiment concerned the effects of noise on work. They were told
nothing in relation to whether noise should facilitate or inhibit
their performance. S's were not informed of their daily progress
under noise and quiet conditions.


* Set is employed here to relate the formal instructions given S
and performance subsequent to these instructions.

** This paper is adapted from a thesis presented to the School of
Education, Indiana University, in partial fulfillment of the
Ph.D. degree. I am indebted to Dr. William H. Fox, whose
encouragement and support were invaluable in the execution of the
thesis investigation; to Dr. [...], whose incisive comments and
criticism aided materially [...] certain formal aspects of the
analysis; to Dr. R. C. Davis and Dr. Delton Beir, Psychology
Department, Indiana University, who contributed in no small
measure to my thinking on this problem, and finally to Dr.
Nicholas Fattu, Director, Institute of Educational Research,
Indiana University, for acting in an advisory capacity with
respect to the quantitative analysis.


Group B. S's in this group were told that the
experiment concerned the effects of noise on work.
They were shown a faked work curve graph, which
was presented to them as genuine, and were told that
the results were from a previous experiment in this
area. These faked curves indicated that in a pre-
vious experiment other S's had performed better
under the noise conditions than under the quiet
conditions. The noise work curve (blue) was al-
ways above the quiet curve (red), each curve dem-
onstrating the negative acceleration which is char-
acteristic of some learning curves. As in Group
A, the subjects in Group B were not informed of
their daily results. Inquiries of this sort by in-
dividual S's were met with ambiguous replies.
Group C. These S's were treated the same
as those in Group B, except that the faked work
curves demonstrated that previous S's had per-
formed better under the quiet conditions. As in
Groups A and B, the S's in this group were not
informed of their daily progress or of "how they
were doing."
Group D. The S's in this group were treated
the same as those in Group C, except that the
faked work curves demonstrated that previous S's
had at first performed better under quiet con-
ditions for the first three or four days and then,
gradually getting accustomed to the noise, per-
formed better under the noise conditions for the
last three or four days. As with the other three
groups, S's in Group D were not informed of their
daily progress, i.e., whether or not they were
getting different results under the varying con-
ditions.


Procedure 


As was specified in the design, the S's were
shown faked work curves appropriate to the group
to which they were randomly assigned. This is
not to imply that merely by looking at the graphs
the S's would be influenced by them. As a
precautionary measure an attempt was made to
standardize the presentation of the stimuli.

The following procedure was used: A set of
formal instructions was presented to the subject
upon entrance into the experimental room. Each
S was instructed to read these aloud, then to study
the graph, and then to give in his or her own words
a short interpretation of what had been read, and
of the work curves on the graph. The task instruc-
tions were presented to all S's on the Day 1 trial,
in order to obtain an initial performance meas-
ure; the different experimental instructions were
not presented until prior to the Day 2 trials.

The instructions were actually nothing more
than a verbal description of what each set of graphs
demonstrated. In this way, each S had at least
verbalized what was contained in the written in-
structions, and on the graph. In this manner the
experimenter was able to reinforce the formal
material with a comment that the subject was
expected to perform in a certain manner because
previous subjects had performed in that way.
Each subject was also told that at the end of each
30 minute session he would be asked whether he
had performed better under the noise or the quiet
condition. The appropriate spurious graph was
kept on a desk in front of each S during the seven
sessions served under experimental conditions;
this refers to Groups B, C, and D, as Group A
was not oriented by the experimenter in either
direction. There was a 72 hour interval between
the fifth and sixth replications; otherwise there
was a 24 hour interval between replica-
tions.

It is pertinent to point out again that the ex-
perimental conditions, as measured by a vari-
able that we shall call Y, were not introduced un-
til immediately prior to the second session. Prior
to obtaining the measures of Y under experiment-
al conditions, each S was given a preliminary 30
minute session, under both noise and quiet con-
ditions, i.e., 15 minutes allotted to each con-
dition. These preliminary measures we shall
call X.

Broadly stated, the hypothesis that we wish
to test is that there are no differences in the ef-
fects of the different sets of instructions, and
that any differences in final mean scores of the
experimental groups, after allowances have been
made for chance differences in initial mean scores,
are due primarily to chance fluctuations in samp-
ling procedures. The allowances for initial dif-
ferences were made in terms of the regression of
final on initial measures. These operations were
included in the analysis of covariance which was
the primary statistical technique utilized in the
analysis of the data.
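
A minimal sketch of the adjustment just described, assuming the textbook one-way analysis of covariance with a single covariate (initial score X, final score Y): the error and total sums of squares of Y are each reduced by the part predictable from X, and the adjusted between-groups mean square is tested against the adjusted error mean square. The function name and the six data points are invented; the paper does not print its computing formulas, so this is an illustration of the general technique rather than the author's worksheet.

```python
def ancova_f(groups):
    """One-way analysis of covariance with a single covariate.

    groups -- list of groups, each a list of (x, y) pairs
              (x = initial measure, y = final measure)
    Returns (F, df_between, df_error).
    """
    def sums(pairs):
        n = len(pairs)
        mx = sum(x for x, _ in pairs) / n
        my = sum(y for _, y in pairs) / n
        sxx = sum((x - mx) ** 2 for x, _ in pairs)
        sxy = sum((x - mx) * (y - my) for x, y in pairs)
        syy = sum((y - my) ** 2 for _, y in pairs)
        return sxx, sxy, syy

    everyone = [pair for g in groups for pair in g]
    txx, txy, tyy = sums(everyone)          # total sums of squares and products

    exx = exy = eyy = 0.0                   # pooled within-group (error) sums
    for g in groups:
        sxx, sxy, syy = sums(g)
        exx, exy, eyy = exx + sxx, exy + sxy, eyy + syy

    adj_total = tyy - txy ** 2 / txx        # Y variation left after regression on X
    adj_error = eyy - exy ** 2 / exx
    k, n = len(groups), len(everyone)
    df_between, df_error = k - 1, n - k - 1  # one error df is spent on the regression
    f = ((adj_total - adj_error) / df_between) / (adj_error / df_error)
    return f, df_between, df_error


# Hypothetical data: two groups of three subjects each.
group_1 = [(1, 2), (2, 3), (3, 5)]
group_2 = [(1, 4), (2, 6), (3, 7)]
f_value, df_b, df_e = ancova_f([group_1, group_2])
```

Under these formulas, four groups of 15 subjects would give 3 and 55 degrees of freedom for each F.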


Experimental Task and Noise Conditions 


The task consisted of adding 6, 7, 8, and 3
successively to a given two-place number, and
then repeating this operation for 60 seconds.

The addition would then be stopped by the ex-
perimenter, and S given another two-place num-
ber. Each of these 60 seconds of addition will be
called a "problem." Each S was told to add as
rapidly and accurately as possible, and that he
would be given credit for the total number of [...]

[...] noise and 15 measures under quiet for each S,
each day. For the 15 minute session under noise
each day, the sum of the responses in each one
minute period was obtained and totaled, and the
same procedure was used for the same S under


TABLE I

THE COMPOSITE PERFORMANCE SCORES OF FOUR GROUPS OF SUBJECTS UNDER
NOISE AND QUIET CONDITIONS FOR EIGHT SESSIONS

                                     Day
Group  Condition     1     2     3     4     5     6     7     8
  A    Quiet      2581  3001  3539  3959  4189  4635  4697  4209
       Noise      2294  3029  3364  4050  4092  4504  4529  4160
  B    Quiet      2806  2700  2988  3451  3302  3830  3643  4207
       Noise      2590  3743  3664  4301  4587  5257  4876  6372
  C    Quiet      2371  3348  3961  4123  5287  4643  5156  4968
       Noise      2976  2702  3156  3300  3844  3782  4016  3934
  D    Quiet      2612  3122  3312  3681  3197  4140  4083  4208
       Noise

TABLE II

COVARIANCE F VALUES UNDER NOISE FOR THE FOUR GROUPS OF S's

Days     1-2      1-5      1-6      1-8
F       7.87*    3.81*    1.26     9.96*


TABLE III

COVARIANCE F VALUES UNDER QUIET FOR THE FOUR GROUPS OF S's

Days     1-2      1-5      1-6      1-8
F       2.68     6.47*    4.30*    1.13


Figure 1. Performance Curves of 4 Groups of S's under NOISE for an 8 Day Period (vertical axis: Correct Responses)


Figure 2. Performance Curves of 4 Groups of S's under QUIET for an 8 Day Period (vertical axis: Correct Responses; horizontal axis: Days)

quiet. Thus each S had two performance scores 
for each day. 
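
The task and its scoring can be sketched as follows. Two things here are assumptions rather than details given in the text: the addends are taken exactly as printed above, and a response is counted as correct when it matches the true running sum at the same position in the sequence. All function names and the sample responses are hypothetical.

```python
from itertools import cycle

ADDENDS = (6, 7, 8, 3)  # as printed in the text above


def correct_sums(start, count, addends=ADDENDS):
    """First `count` running sums when the addends are applied in rotation."""
    out, total = [], start
    for addend, _ in zip(cycle(addends), range(count)):
        total += addend
        out.append(total)
    return out


def score_problem(start, responses, addends=ADDENDS):
    """Number of responses matching the true running sum at the same position
    (an assumed scoring rule; the paper does not spell one out)."""
    key = correct_sums(start, len(responses), addends)
    return sum(1 for given, wanted in zip(responses, key) if given == wanted)


def session_score(problems):
    """Total correct responses over the one-minute problems of one session.

    problems -- list of (starting number, list of responses) pairs
    """
    return sum(score_problem(start, responses) for start, responses in problems)


# Hypothetical session of two one-minute problems (a real session had fifteen).
session = [
    (47, [53, 60, 68]),
    (12, [18, 25, 33, 36, 42]),
]
total_correct = session_score(session)
```

Each S would receive one such total for the noise session and one for the quiet session on each day.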


Operationally Defined Noise Conditions 


The verbal noise used in the experiment was
Volume 1: "I Can Hear It Now," side number
one. This record contains verbal narrations of
important news events of the 1933-1945
era. The recording was played at


ions. 


Selection and Random Arrangement of Subjects


The 60 subjects used in the experiment were 
from an undergraduate section i 
chology, Participation in t 


cording to the Schema 
on this page, in which Ax B, C, and D represent 
the different "treatments, ^» 


Results 


Osite Quiet, there are record- 

hese represent the total 
or the 15 Subjects in that 
the quiet conditions. 


Intergroups Analysis 


Figures 1 and 2 demonstrate eraphically the 
data which are recorded in Table I. Specifically, 
d with respect 


oise condition 
independent from performance chan 


Quiet condition. It is clear from Fj 



SCHEMA: SYSTEMATIC VARIATION
OF INSTRUCTIONS

not 
to the different experimenta] ‘Set’ which umm 
introduced until Day 2. Upon examination 0 


An analysis of covariance was com-
puted for Day 2, Day 5, Day 6, and Day 8 to de-
termine whether the existing gross differences


:or t 
asure was obtained praa n 
the introduction of experimenta] conditions on 
2, 


ὶ ta^ 
S. The asterisk indicates 5 
eat.05 level. 


Intra-Group Trends: Cross-Over Analysis of [...]


The next operation was to consider the intra-
group differences in relation to performance

TABLE IV

CROSS-OVER ANALYSIS OF VARIANCE FOR COMPARING THE PERFORMANCE
OF GROUP A S's UNDER NOISE AND QUIET

F-ratio for treatments not significant.


TABLE V

CROSS-OVER DESIGN FOR COMPARING THE PERFORMANCE OF GROUP B S's
UNDER NOISE AND QUIET

F-ratio for treatments significant at .01 level.


Figure 3. Performance Curves of Group A: Same S's Working Under Noise and Quiet Each Day (vertical axis: Correct Responses)


Figure 4. Performance Curves of Group B: Same S's Working Under Both NOISE and QUIET Each Day (vertical axis: Correct Responses; solid line: Noise; broken line: Quiet)


under both Noise and Quiet conditions. Figures 3,
4, 5, and 6 present the performance curves of
the respective groups, i.e., the same S's
working under both conditions. The comparative
curves for each group are based upon the data
given in Table I, the raw scores having been con-
verted to mean scores.

An analysis of variance was com-
puted in order to determine the following points:

Group A S's were not given any instructions
regarding the condition under which they were ex-
pected to perform better. Presumably, they pro-
ceeded on their own instructions. Basically,
then, the point in question is whether there is a
significant F value for trend toward better per-
formance under one condition (Noise) or toward
the other condition (Quiet).

Group B S's were given specific formal in-
structions regarding the condition under which the
experimenter expected them to perform better;
specifically, the defined noise condition, compared
to quiet conditions. There was a trend, as is clear
from Figure 4, toward accelerated performance
under the noise condition. Does this trend repre-
sent a statistically significant difference from
performance of the same S's under the quiet
conditions?


In similar fashion the analysis was made for Groups C and D. Cochran and Cox¹ point out that in biological assay a design has been used which closely resembles the latin square, but has some advantages when the number of treatments is small. In the simplest case, which we have here, there are two treatments, noise and quiet, which have been alternated in ABBA fashion. The design and computations used in the analysis are presented in Tables IV, V, VI, and VII.
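The cross-over layout described above (two treatments alternated over the sessions, each session giving a first and a second trial) can be sketched in code. The following is a minimal illustration, not the author's worksheet: the function name `crossover_anova`, the input format, and the sample scores are hypothetical, and it assumes each treatment occurs equally often as first and as second trial so that rows, replications, and treatments separate cleanly. With eight replications it yields 7, 1, 1, and 6 degrees of freedom for replications, rows, treatments, and error.

```python
from typing import Dict, List, Tuple

Cell = Tuple[str, float]  # (treatment label, score), e.g. ("N", 310.5)


def crossover_anova(first: List[Cell], second: List[Cell]) -> Dict[str, float]:
    """Analysis of variance for a two-treatment cross-over layout.

    `first` and `second` hold the first and second trial of each
    replication (session); every replication must contain both treatments.
    """
    if len(first) != len(second) or len(first) < 3:
        raise ValueError("need two rows of equal length, at least 3 replications")
    n = len(first)
    values = [v for _, v in first] + [v for _, v in second]
    correction = sum(values) ** 2 / (2 * n)  # correction term, G^2 / N

    treatment_totals: Dict[str, float] = {}
    for (ta, va), (tb, vb) in zip(first, second):
        if ta == tb:
            raise ValueError("each replication must contain both treatments")
        treatment_totals[ta] = treatment_totals.get(ta, 0.0) + va
        treatment_totals[tb] = treatment_totals.get(tb, 0.0) + vb
    if len(treatment_totals) != 2:
        raise ValueError("exactly two treatments expected")

    ss_total = sum(v * v for v in values) - correction
    ss_reps = sum((a[1] + b[1]) ** 2 for a, b in zip(first, second)) / 2 - correction
    row_totals = (sum(v for _, v in first), sum(v for _, v in second))
    ss_rows = sum(t * t for t in row_totals) / n - correction
    ss_treat = sum(t * t for t in treatment_totals.values()) / n - correction
    ss_error = ss_total - ss_reps - ss_rows - ss_treat

    df_error = n - 2  # (2n - 1) total, less (n - 1) replications, 1 row, 1 treatment
    return {
        "ss_total": ss_total,
        "ss_replications": ss_reps,
        "ss_rows": ss_rows,
        "ss_treatments": ss_treat,
        "ss_error": ss_error,
        "df_error": df_error,
        "f_ratio": ss_treat / (ss_error / df_error),  # treatments have 1 d.f.
    }


if __name__ == "__main__":
    # Hypothetical scores for four sessions; N = noise, Q = quiet.
    first_trials = [("N", 10.0), ("Q", 8.0), ("N", 12.0), ("Q", 9.0)]
    second_trials = [("Q", 7.0), ("N", 11.0), ("Q", 10.0), ("N", 13.0)]
    print(crossover_anova(first_trials, second_trials))
```

The error term here is simply the residual after replications, rows, and treatments are removed, which is the usual choice for this layout; the resulting F-ratio is referred to the F table with 1 and n - 2 degrees of freedom.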
Figure 3 presents the performance curves of Group A S's working under both noise and quiet conditions. Table IV presents the analysis of variance for Group A. The F-ratio for treatments of [...].90 was not significant at the .05 level.
It may be observed that in Figure 6, there is a cross-over in performance between Days 4 and 5, with Group D subjects gradually increasing their response rate under the noise condition, when compared with their responses under the quiet condition. In theory, this cross-over was expected, and from the curves, Group D followed the trend predicted. Since the analysis of variance is a one-sided test, differences tend to disappear when they occur on both sides. It is clear from Figure 6 that differences on each side of the cross-over balance out. To handle this problem it was necessary to use a simple arithmetical artifact. The operation consisted of obtaining the difference between performance under noise and that under quiet for Days 5, 6, 7, and 8, and subtracting these differences for each of the cited days from the performance means under Quiet. It will be noted that the empirical data for Days 5, 6, 7, and 8 show Group D performing better under Noise than under Quiet condition; the result of the transformation was to eliminate the cross-over and place the performance curve for noise the same distance below that of the quiet curve, as it originally was above it. The next operation was to carry out the analysis of variance which is summarized in Table VII. The F-ratio for treatments of 33.70, for 1 and 6 degrees of freedom, was statistically significant at the .01 level.
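The arithmetical transformation just described amounts to reflecting the noise means about the quiet means on the days after the cross-over. A minimal sketch follows, assuming per-day means are held in dictionaries keyed by day; the function name `reflect_noise_below_quiet` and the sample means are hypothetical and are not taken from the tables.

```python
from typing import Dict, Iterable


def reflect_noise_below_quiet(
    noise: Dict[int, float], quiet: Dict[int, float], days: Iterable[int]
) -> Dict[int, float]:
    """Return the noise means with the listed days reflected about the quiet means.

    For each listed day the difference (noise - quiet) is subtracted from the
    quiet mean, so the noise curve lies as far below quiet as it was above it.
    """
    transformed = dict(noise)  # leave the caller's data untouched
    for day in days:
        difference = noise[day] - quiet[day]
        transformed[day] = quiet[day] - difference
    return transformed


if __name__ == "__main__":
    # Hypothetical means for two days on either side of a cross-over.
    noise_means = {4: 300.0, 5: 330.0}
    quiet_means = {4: 320.0, 5: 310.0}
    print(reflect_noise_below_quiet(noise_means, quiet_means, days=[5]))
```

Days not listed are returned unchanged, so only the sessions after the cross-over are altered.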
To summarize the results: (a) Group B subjects produced a significantly higher response rate under Noise conditions when compared with the three other groups; (b) Group C maintained a higher response rate under Quiet, but the difference was statistically significant at the .05 level on only two of the four covariance analyses; (c) Groups B and C performance deteriorated somewhat during the 72 hour interval between Days 5 and 6; (d) in comparing the intra-group trends Group A produced no significant differential response rate in either direction. Groups B and C produced statistically significant differential over-all response rates. Group D also produced a statistically significant differential response rate over the eight sessions. This difference was indicated after the arithmetical transformation had been completed in order to "eliminate" the cross-over.

Discussion of the results. Unfortunately, there is a lack of systematic data with which to compare the cited results. The reference here is to overt performance changes, and not to physiological measures such as GSR's or action potential measures, mention of which will be made later in the discussion.
The intergroups analysis will be considered first. The instructions seemed to have an immediate effect upon the performance of the subjects. The various inversions of the groups from Day 1 - 2 under Noise were significant at the .05 level with an over-all F-ratio of 71.87. Group B was clearly trying harder under the Noise condition, while Group C appeared to be depressing their performance when the Noise condition was introduced. The various groups maintained their relative rank order on the curves for Days 3 and 4 with Group D gradually increasing their response rate under Noise. This seems to be one explanation for the somewhat lowered covariance F value of 3.81 between Days 1 - 5 under Noise. The S's in Group D, it will be recalled, were oriented by the formal instructions that they were expected to perform better under the Noise condition after the first 3 or 4 days, but were expected to perform better at first without the Noise.

1. W. G. Cochran and G. M. Cox, Experimental Designs (New York: John Wiley & Sons, 1950).

TABLE VI

CROSS-OVER DESIGN FOR COMPARING THE PERFORMANCE OF GROUP C S's UNDER NOISE AND QUIET

F-ratio for treatments significant at .01 level

TABLE VII

CROSS-OVER DESIGN FOR COMPARING THE PERFORMANCE OF GROUP D S's UNDER NOISE AND QUIET

F-ratio for treatments = 33.70; significant at .01 level

Figure 5. Performance Curves of Group C: Same S's Working Under Both NOISE and QUIET Each Session (vertical axis: Correct Responses)

Figure 6. Performance Curves of Group D: Same S's Working Under Both NOISE and QUIET Each Session (vertical axis: Correct Responses; legend: Noise, Quiet)

The F value of 1.2 between Days 1 - 6 under Noise was clearly not significant. This introduces an important factor. It will be recalled that between experimental Days 5 and 6 there was a 72 hour interval which can be termed the "weekend break" between Friday (Session 5) and Monday (Session 6). This result seemed to suggest that the orientation that the Group B S's were given toward performance under Noise had deteriorated to a large degree during this 72 hour interval.

Group B performance decreased to a greater extent on Day 7, but on the final test Day 8, Group B performance was again significantly higher than that of the other three groups. One principal low-level explanation of this occurrence might be that the S's in Group B were aware that Day 8 was the last session in which they would be required to perform the routine task, and were saving their "best" performance for that day. The S's in this group were instructed that they should be doing progressively better under Noise on each trial, and it is entirely possible that they were attempting not to "burn themselves out" before Day 8.

In any case, Group B performance, on the whole, showed a significant trend toward better performance under the Noise condition. This is particularly evident when the work curves under both Noise and Quiet for Group B are compared. The analysis of variance in the cross-over design shows an F value for the "treatments" of 21.55 with a probability of less than .01. Figure 4 shows Group B as having performed slightly better under quiet conditions than under Noise on Day 1 (initial measure). This difference, however, was not statistically significant. On Day 2 there was an inversion and Group B did consistently better with the Noise. It is not untenable to suppose that these S's depressed their response rate with the record off. One female subject in this group reported that the reason for her better performance under Noise was that she was "compelled" to add faster in order to keep from hearing the record.

The evaluation of such judgments is clearly not within the scope of this investigation. It is recognized that there is probably a multitudinous number of reasons that particular S's could give for their respective performances. However, in general, these could not be verified. Some typical reasons that were given for what some S's thought was a poor showing were: "I'm tired from studying"; "I have a test tomorrow, and can't think of anything else"; "I'm a pledge and my fraternity or sorority made me do such and such a thing today."


It is clear that the subjects used in the investigation were not in the same physiological condition from day to day. On one day a subject would appear cheerful and enthusiastic about the task; on another day the same subject would come into the experimental room, and give the impression that this was just another routine duty to perform. Clearly, these judgments are made from the experimenter's standpoint, and should not be interpreted as absolute in any sense. It is entirely possible that the investigator himself was not in the same physiological state on succeeding days.

Group C subjects produced a higher daily response rate under quiet conditions, but the differences were statistically significant in only two of the four covariance analyses under the quiet condition. As was pointed out with Group B under Noise, Group C performance under Quiet appeared to deteriorate in the 72 hour interval between Days 5 and 6. This occurrence suggests that if instructions are to be an important variable in this particular type of work task, frequent reinforcements with the particular set of instructions are desirable. If this condition is not fulfilled it would appear that extinction of the desired effect would soon take place.

An additional phase of the analysis was embodied in the comparisons that were carried out between the four groups, separately, to determine whether each group showed any significant trend to perform differentially better under one condition than under the other. Group A were told only that they were serving in an experiment on the effects of noise. The analysis shows no significant trend in either direction in favor of Noise or Quiet. This group, as with the others, probably brought some bias into the experimental room toward one condition or the other. There was a slightly higher response rate for Group A under the Quiet conditions, but the difference was not significant. The implication here is that the subjects were not differentially disturbed by the noise condition.

Group B subjects, having been oriented toward a higher response rate under noise, produced a significant trend in that direction. Noise, here, certainly had no inhibiting effects. Group C produced a differentially significant trend toward better performance under the quiet condition, while Group D, which received the "complex" instructions, produced a significant trend toward better performance under quiet at first, with a gradual increase in response rate under noise toward the later sessions.
Analysis of the data indicates that:

1. Formal instructions, or giving the subject an orientation as to what direction to work in, appear to be a significant variable in relation to routine work tasks. In order to produce statistically significant results, and eliminate [...] of performance, the major property of these instructions seems to be that they be clear, and orient the subject in a precise direction.

2. Once the desired direction is produced, the evidence seems to indicate that frequent instructional "reinforcements" are necessary to maintain performance in that direction, at least in relation to the routine verbal addition task employed.

3. Noise, per se, of a given intensity, does not appear to have any necessary effect upon the performance of routine work tasks. This finding agrees with some physiological data indicating that subjects gradually become adapted to the noise.

4. There is no consistent difference in production if no effort is made to give subjects formal instructions with respect to the conditions under which they are expected to perform better.


Implications. At the expense of over-extrapolating from the empirical data the investigator believes it advisable to make some comment under the above heading. Although this investigation is classed as a "work" experiment, the term has not been used to imply that the variables important for work tasks would not also be important for learning. Unfortunately, there are few systematic data in reference to this. Although the present investigation did not deal with control over the subjects' past experience, reference might profitably be made to Harlow's² data on building generalized "learning sets" in monkeys. All the studies used a non-correction method, somewhat analogous to the present study, in which the S's were not informed as to their progress, or when they had made an error. Even though the monkeys were necessarily wrong on about 50 percent of their first choices, they were able to learn object discriminations from both their failures and their successes. Harlow designates this learning how to learn a problem as a learning "set," and concludes that a knowledge of the nature of learning [...] to educational theory and [...], as justi[...] investigation.

The acceptance of infra-human studies has the important advantage of allowing us to ascertain testable principles in relation to human behavior. The difficulty here is that there is the verbal factor to contend with, and the problem is more basic than Harlow proposes.

One might be justified, at this point, in asking what educational theory can profitably derive from this discussion. There appears to be one obvious point: educational institutions, in a general sense, are continually employing the "directional" technique. Teachers at all levels are continually directing (mostly verbally) learning situations with the hope that by virtue of this process the pupil has been aided in selecting the di[...] repetitive tasks, if per[...]tain direction are i[...] facets of the problem unanswered: (1) Are instructional effects relatively temporary? (2) Are schedules of instructions instrumental in producing differential effects; i.e., introducing the particular set of instructions during each period, or introducing them in intermittent fashion, or at an a-periodic interval?

A note of caution should be introduced at this point with respect to the interpretations of the results of the many educational "methods" investigations.

A teacher might be so convinced that one educational method was superior to another that his enthusiasm for the method would so influence the subjects that they would actually do much better under it than under its competitor. But another educator could easily be so sincerely convinced of the superiority of the rival method that he in turn could prove his method to be the better one. Now which really is the better method?

2. H. F. Harlow, "The Formation of Learning Sets," Psychological Review, LVI (1949), pp. 51-65.


INTERGROUP ATTITUDES AND EXPERIMENTAL CHANGE


MARGARET L. HAYES
MARY ELIZABETH CONKLIN
New York State College for Teachers
Albany, New York


Purposes and Background


During the years 1945-46 and 1946-47 the New York State College for Teachers was one of the participants in the College Study in Intergroup Relations which was sponsored by the American Council on Education and financed by the National Conference of Christians and Jews, under the direction of Dr. Lloyd A. Cook. This study is one of several projects undertaken by the faculty committee on intergroup education at this College. It was undertaken with the following purposes: (1) to determine the quality and extent of the desirable changes in intergroup attitudes that can be brought about in adolescents through directed teaching; (2) to determine the relative effectiveness of different types of teaching; (3) to find how response to attempts to change attitudes is related to membership in groups determined by age, sex, intellectual level, religious affiliation, and cultural background. Attempts [...] problem. Since the work [...] was with different age levels and methods of evaluation, it seems less confusing to discuss the work of each year separately and then bring together the findings at the end.


THE FIRST YEAR PROJECT 


Subjects and setting


The schools which were asked to join in the project [...] mixture of Protestants, Catholics and Jews [...]. School B is a consol[...] Protestants [...] nearly [...]n-born. School C is the campus school in connection with the teachers college; there is a larger percentage of [...] children in this school than in the other schools, but the majority are Protestant. School D is a large city public school; it has many children with foreign-born parents and many with poor economic backgrounds; there is a mixture of religious groups and this is the only one of the schools that has Negro students; also, many of the students are over-age. The study was confined to tenth grades in these schools and in each school there was an experimental group, (X), and a control group, (C). Both groups were taught, except in one instance, by the same teacher. There were 103 boys and 106 girls involved in the whole project. Teachers for this project were carefully chosen because of their interest in intergroup education and their demonstrated expertness in instruction. The teacher in School A taught Biology; the two in School B taught Social Studies and English, respectively; the teacher in School C taught English and the one in School D Social Studies.


Procedures 


Several conferences were held with teachers to plan the organization and evaluation of the study; the national director assisted with some of these. The teachers working on the study agreed that the general aim was to improve intergroup relations in their classes and that the specific aims for pupils were: (1) to grow in social sensitivity and understanding concerning problems of relationships between economic, racial and religious groups; (2) to learn techniques for living satisfactorily with members of other groups. The participants felt that this involved seeing the problems of less favored peoples, being appreciative of contributions regardless of [...], religion or national origin, being aware of the ways by which other people are hurt, and seeing people as individuals rather than as members of a class.

Each teacher selected a different technique to use with the experimental class, and planned a unit of subject matter. Particular effort was made to find material which could be a natural part of the existing course of study. This was to prevent the anticipated criticism that the study was not concerned with the real business of the school and to make it possible to work for the improvement of attitudes without arousing any unnecessary excitement in the pupils. To reduce the possibility of the pupils regarding themselves [...], all pupils in the tenth grade were given the tests and in most cases the experimental unit was given to other pupils in the grade after the study was completed.
A brief summary of the various methods used 
is given below: 


1. School A. The teacher used biographies of scientists in both X and C classes but used for her X group scientists of different nationalities and races. Each student read and reported on the contribution of some man or woman to science. Each student also made a written report on source material.

2. School BI. This unit was a study of "Religions in America". The teacher used research followed by panels and later by informal class discussions. The class agreed to use Protestant, Catholic, Jewish, Mormon, Quaker, and minor religious sects as their field of study.

3. School BII. The teacher used the play, "The American Way" by Kaufman and Hart, already a part of the course of study, as the springboard for discussion. The aim of instruction was to encourage thinking in terms of individuals rather than class through a recognition of the "American Way" of equal respect and opportunity for persons of all races, creeds, and nationalities. The class read and decided to memorize "And No One Asked" by Morris Reich; later this was given as a choral reading in assembly.

4. School C. The teacher in School C used a class discussion of Hortense Powdermaker's Probing Our Prejudices and followed this by class and individual reading of short stories and novels.

5. School D. Lectures were given on "The Growth of Democracy". The teacher selected and emphasized those historical events which had a direct bearing on the aims of the experimental study, such as the federal Constitution, the Preamble, the Bill of Rights and the Amendments. Considered also were the setbacks to democracy, such as the poll tax. Anti-discrimination laws and the F.E.P.C. were also included.

It will be noted that in Schools A, BI, and D the approach could be characterized as intellectual; in Schools BII and C much use was made of vicarious experience.


Two kinds of evaluation techniques were used with the ten sections involved in the project. Two tests were part of the "Social Problem Analysis, Advanced Series".¹ One of these tests dealt with attitudes toward problems of the Negro and the other with attitudes on slum problems and each had four subdivisions: (1) reasons for delay in attacking the problem; (2) scope of the problem; (3) what to do about the problem; and (4) hope of progress in solving the problem. The scoring values were obtained by combining the judgments of the teachers involved in the study with those of the members of the faculty committee on intergroup education at the college. The second technique was a sociometric device, "The Social Acceptance Scale".² It required pupils to make their choices of people they would like best as friends and those they would not care to have as friends. A data sheet was filled out for each child giving age, nationality, race, religion, birthplace of parents, father's occupation, economic background, mental level and other identifying data. The two social analysis tests and the Social Acceptance Scale were administered to the pupils during the last week in October. The experimental teaching was then done for three weeks and the pupils tested again.

1. This is a mimeographed test series obtained from the Bureau of Educational Research, Ohio State University.

2. Published by Ohio State University,


Analysis 


The analysis that follows is in terms of what changes took place during the period of experimental teaching. The test data are treated first, then the sociometric data. The data are examined in terms of: experimental and control groups, methods used, and groups determined by school, cultural groups, and sex of pupils. The following groups were considered important from the standpoint of cultural background and so designated: children whose parents were born in this country (old-stock Americans) O; children one or both of whose parents were foreign born (new-stock Americans) F; Jewish Americans, J; and Negro Americans, N. These classifications are functional rather than logical. Pupils were considered under-age if they were more than a year younger than 15 years 2 months, the expected age for this group; over-age if they were more than a year older than the expected age. Children with IQ's 90-110 were considered average; those above this level high; those below, low.
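The age and intelligence groupings defined above are simple threshold rules and can be written out as a short sketch. The function names and the label "at-age" for pupils within a year of the expected age are hypothetical conveniences, since the text names only the under-age and over-age groups; ages are taken in months.

```python
EXPECTED_AGE_MONTHS = 15 * 12 + 2  # 15 years 2 months, the expected age


def age_group(age_months: int) -> str:
    """Classify a pupil's age relative to the expected age for the grade."""
    if age_months < EXPECTED_AGE_MONTHS - 12:
        return "under-age"  # more than a year younger than expected
    if age_months > EXPECTED_AGE_MONTHS + 12:
        return "over-age"  # more than a year older than expected
    return "at-age"  # hypothetical label; the text leaves this group unnamed


def iq_group(iq: int) -> str:
    """Classify an IQ as low, average (90-110 inclusive), or high."""
    if iq < 90:
        return "low"
    if iq > 110:
        return "high"
    return "average"


if __name__ == "__main__":
    print(age_group(16 * 12 + 6), iq_group(104))
```

Whether the boundaries themselves (exactly a year above or below, or an IQ of exactly 90 or 110) were treated as inclusive is an assumption here; the text gives only the ranges.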

Table I gives average changes in scores on Social Analysis Tests by schools and by experimental and control groups.

Table II gives the percentage of pupils in each group who improved in attitudes as indicated by scores on the Social Problem Analysis Tests.

The total X group made a change of +1.[...] in total attitude; the C group made a change of [...]


TABLE I

AVERAGE CHANGES IN SCORES ON SOCIAL ANALYSIS TESTS BY SCHOOLS AND EXPERIMENTAL AND CONTROL GROUPS

(Column groups: Slum Problems; Negro Problems)


TABLE II

PERCENTAGES OF PUPILS WHO IMPROVED IN ATTITUDES ON SOCIAL PROBLEMS

Group    Negro    Slum    Total
AX         64      36      64
AC         46      54      61
BIX        41      68      60
BIC        41      48      53
BIIX       41      44      44
BIIC       67      24      33
CX         57      39      57
CC         45      32      41
DX         57      61      74
DC         13      42      33


The change in the X group was great enough to indicate that it was not due to chance; that in the C group was not. For differences within the whole group an average difference of 1.0 gives T = 1.95; this is at the 5% level of significance. In all schools there were larger percentages of pupils in the X groups who improved in attitudes than there were in the C groups. A consideration of separate groups and separate problems shows many discrepancies; this is probably an indication that there were important factors not under the control of the experimenters. School C seemed to be most successful when both problems were considered together, followed by Schools D and B (Group I). Results from Schools A and B (Group II) were inconclusive. On the Negro problem School A was most successful, followed by School C. On the slum problem School B, Groups I and II, secured best results, followed by School D; for other groups results were inconclusive or indicated an actual loss in attitude. It would appear that the procedures for changing attitudes were successful in an overall manner, but there is no clearcut evidence that one method is more successful than another.

Table III shows the percentages of pupils by sex in X and C groups who showed improvement in attitude.


TABLE III

PERCENTAGES OF BOYS AND GIRLS SHOWING IMPROVEMENT IN ATTITUDE ON SOCIAL ANALYSIS PROBLEMS

                  X                 C
Problem      B    G    T       B    G    T
Negro       40   60   50      53   26   14
Slum        60   45   30      37   39   29
Total       54   65   60      43   41   29


A larger percentage of girls improved in at- 
titude in the X group; there was little sex differ- 
ence in total scores in C group. With the excep- 
tion of boys on Negro problems, all X groups 
showed more improvement than C groups. 

Table IV gives the average changes in Scores 
on Social Problem Analysis Tests by Cultural 
groups. 

In all cases, cultural groups in the control sections were at the beginning of the study superior in the intergroup attitudes investigated to cultural groups in the experimental sections. In the beginning Jewish Americans displayed most favorable attitudes in both X and C groups as compared to old-stock Americans and new-stock Americans, on both types of problems; the old-stock American group showed more favorable attitudes on Negro problems than did the new-stock American group. In total attitudes the new-stock American X group made the most gain, 5.4 points; the Jewish American X group came next with a total gain of 4.2 points; other total gains were [...] points or less. On the attitudes toward Negro problems the greatest gain was by the new-stock American X groups; the next greatest gain was by the Jewish American X groups; little gain was made by the old-stock American group or by any of the control groups, with the possible exception of the old-stock American which showed a change which approaches significance. Results on attitudes toward the slum problems followed the same pattern but were not as marked.

Table V shows percentages of pupils who improved in attitude according to intelligence groupings.

The high and average intelligence X groups both contained a higher percentage of pupils who improved in attitude than did the C high and average groups. There seemed to be little difference in the amount of improvement made by high and average intelligence groups. There were only nine pupils in the low intelligence group; this is probably too few to warrant any conclusions.

The sociometric data was used to reveal each child's adjustment to the social group which constituted the section of which he was a member. It sheds light on intergroup attitudes when groups include in their choices members of different cultural groups. Table VI is a summary table of sociometric data obtained for all the groups of this study. Results are group averages. Column 3 indicates changes in the average numbers of acceptances of others or group expansiveness; column 4 gives the changes in the average number of rejections of others; column 5 contains the changes in the average numbers of mutual acceptances; column 6 gives the changes in the average numbers of mutual rejections; column 7 indicates the changes in the numbers of cases where acceptance is extended but not returned. Since the number of first choices is somewhat affected by the number of people for whom choices of one kind or another must be given, the last three columns contain changes in the indices of liking, hostility and of outgoingness respectively. These indices are obtained by dividing the total number of acceptances and rejections of others by the total possible number of choices or rejections and combining these two figures for use in the last column. Plus signs indicate an increase in numbers during the time of experiment; minus numbers indicate a decrease in numbers during the same period. In the discussion of the separate groups that follows this table is used for comparative purposes.
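The three indices described above can be illustrated with a short sketch. Two points are assumptions and not statements from the text: the total possible number of choices is taken as n(n - 1) for a section of n pupils, and the two figures are "combined" by simple addition to give the outgoingness index. The function name and the sample counts are hypothetical.

```python
from typing import Dict


def sociometric_indices(n_pupils: int, acceptances: int, rejections: int) -> Dict[str, float]:
    """Indices of liking, hostility, and outgoingness for one section.

    Assumes every pupil may respond to every other pupil, so the total
    possible number of choices (or rejections) is n * (n - 1).
    """
    if n_pupils < 2:
        raise ValueError("a section needs at least two pupils")
    possible = n_pupils * (n_pupils - 1)
    liking = acceptances / possible
    hostility = rejections / possible
    return {
        "liking": liking,
        "hostility": hostility,
        # Assumption: the two figures are combined by addition.
        "outgoingness": liking + hostility,
    }


if __name__ == "__main__":
    # Hypothetical section of five pupils with 8 acceptances and 4 rejections.
    print(sociometric_indices(n_pupils=5, acceptances=8, rejections=4))
```

Changes of the kind reported in Table VI would then be the difference between the indices computed after and before the experimental teaching.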


TABLE IV

AVERAGE CHANGES IN SCORES ON SOCIAL ANALYSIS TESTS BY CULTURAL GROUPS

Cultural Group    Old-Stock American    New-Stock American    Jewish American
Group                X        C            X        C            X        C
Number              87       69           11       20           11        7
Negro              +.8     +1.2         +3.5    [...]         +2.7        0
Slum                 0      +.1         +1.9     -1.8         +1.5     -1.1


TABLE V

PERCENTAGES OF INTELLIGENCE GROUPS IMPROVING IN ATTITUDES

             High (over 110)     Average (80-110)
Problem      X        C          X        C
Negro        53       42         55       49
Slum         53       40         48       42


Total 


TABLE VI

CHANGES IN GROUP ORGANIZATION SHOWN BY SOCIOMETRIC DATA

Group   No.   Acc.   Rej.   M.A.   M.R.
(table values not legible in the source)


24 JOURNAL OF EXPERIMENTAL EDUCATION (Vol. 22


Figure 1 and Figure 2 are sociometric charts
showing the social organization of groups AX and 
AC respectively at the beginning of the experiment. 
Here, as in the charts that follow, pupils are re- 
ferred to by number and the following legend is 
used: 


choice of individual 


choice by individual . 


reciprocal choice 


Boy m Girl G 


Old-stock American, A; American of foreign 
parentage or new-stock American, F; Jewish 
American, J; Negro American, N. 


; Number 1, who was strongly 


ers 7 and 10. This 
is shown in Figure 2. During the instructional period this social organization developed further, as shown by increased emotional expansiveness and more positive reciprocation. Also, another leader, Number 12, arose. Group AC appeared to make more progress in social development than did Group AX. This is shown in Table VI and in Figure 3.

Groups BIX and BIC showed a similar and rather high degree of social development at the beginning. Group BIX became somewhat better organized, as shown by an increase in the number of mutual acceptances. Group BIIX was better organized than BIIC; neither showed much change. Group CC was better organized at the beginning; Group CX gained more in liking and decreased more in hostility than did Group CC. The pattern of Group CX at the beginning is shown in Figure 4. There is a marked sex cleavage. The two interlocking cliques of girls are interesting; clearly the Jewish American girl Number 3 is the key figure in the feminine group organization. Isolation and rejection in this group do not appear to be based on majority-minority group factors.

The two groups in School D presented in the beginning a marked contrast. Group DC was better developed socially; it had more emotional expansiveness, more mutual reciprocation and less mutual rejection and hostility in general. Group DX made more progress in social development than did Group DC. School D is the only one that has a considerable number of Americans of foreign-born parentage as well as Jewish Americans and Negro Americans. The social organization of Group DX is shown in Figure 5. There is a sex cleavage, but not as marked as that shown in Figure 3. Choices do not appear to be based on majority-minority group relationships to a large extent.


Findings from the First Year of the Project 


1. The experimental groups as a whole made more progress in the development of favorable attitudes on the problems tested in the [illegible] and in group social development than did the control groups as a whole. This was true of nearly all individual groups; this indicates that the experimental teaching was successful in changing attitudes in desirable ways. The [illegible] change and the change in attitude toward Negro problems were significantly large; the change in attitude toward slum problems may be due to chance.


2. As to separate methods used, the one employed in School C appears to be the most effective in changing attitudes in favorable directions. It will be recalled that this method was one of vicarious experience through reading short stories and novels in connection with Powdermaker's "Probing Our Prejudices". However, the discrepancies in scores from various [illegible] indicate that no clear-cut claim can be made for the superiority of one method over another; the results are affected by the type of problem under consideration and probably by other factors also.

3. Girls improved more than did boys in attitude toward Negro problems.

4. New-stock Americans made the greatest gains in the direction toward favorable attitudes; the Jewish American group made the next largest gain.

5. There was no apparent difference in gains made by pupils of high intelligence as compared with those of average intelligence.


THE SECOND YEAR PROJECT 


Subjects and Setting 


The subjects for this project were the eighth grade pupils in five schools in and near Al


FIG. 1

SCHOOL A, GROUP X
BEGINNING OF EXPERIMENT

SCHOOL A, GROUP C
BEGINNING OF EXPERIMENT

SCHOOL A, GROUP C




These schools may be briefly described as follows: School F is a consolidated village-rural school; the economic level is good; all pupils are Protestant; nearly all are of old American stock. School G is a consolidated village-rural school; there are a few Jews and Catholics, but Protestants predominate; the economic level is good. School H is a large city junior high school; there is a mixture of cultural groups; the economic level is low; Protestants predominate but there are many Catholics. School J is a medium sized city school; nearly all pupils are of old American stock; there are about equal numbers of Catholics and Protestants. School K is a village school; pupils of old American stock predominate but there are quite a few whose parents are foreign born; Catholics predominate and there is only one Jew in the grade.

There were 117 boys and 115 girls included 
in the study, distributed in cultural and religious 


ents foreign born, or new stock American, (F) 
20; Jewish Americans, (J) 13; Negro Americans, 
(N) 14; Protestant, (P) 131; Catholic, (C) 78; re- 
ligion unknown, (?) 10. 


Procedures 


In each school an experimental (X) group was given special teaching and a control group (C) was not. Both groups were tested at the beginning and end of a three weeks


at fundamental approaches, rather than Special- 
These categories are describ- 


1. Direct experience: members of various minority and majority groups working and playing together in activities with common purposes and interests.

2. Intellectual approach: The theory here is that if one knows the truth about characteristics and achievements of minority groups, prejudice against them will disappear.

3. Vicarious experience: Here one approaches real experience by projecting himself into the character of a minority group member by reading, acting or listening to something of strong emotional nature.


A series of three meetings was held with the participating teachers and some supervisors and principals to discuss methods, acquaint them with further materials and to plan evaluation. Instructions to teachers were as follows:

1. School F. Study the contributions of the Jew, Negro, and immigrant to history. Use a dramatic emphasis here to cause the children to project themselves into the characters of Jews, Negroes and immigrants. Use factual material presented through discussions or biography, leading to the dramatic expression. Make a great deal of their contributions in the last war.

2. School G. Make a study of the contributions of the Negro, Jew and immigrant to literature, music and art. Make this factual and intellectual in nature. Use biography, lecture, discussion, forums, posters, factual movies and recordings or required readings but avoid plays or any type of procedure that will cause the child to project himself into the character of the Jew, Negro or immigrant. See that the children know and understand what Jews, Negroes and immigrants have accomplished. Marian Anderson and others will be good material for this project.

3. School H. An attempt should be made here to cause children to understand the scientific basis for group understanding. Let them learn the scientific facts of race. Use also contributions of scientists who are Jews, Negroes or immigrants. Use any or all of the following: biography, lectures, discussions, forums, posters, assigned readings, factual movies, and recordings. This is an intellectual approach, so avoid drama. Do not use a method where the child projects himself into the character of a minority group member as he would do in a play or movie or in reading fiction. George Washington Carver is an illustration of a person to be studied.

4. School J. Make a study of the Jew, immigrant and Negro through literature. Read and [illegible] plays, fiction and poetry about these people but not what they themselves have written. Include plays on racial questions. Let children have the vicarious experience of projecting themselves into the minority group characters in these plays, fiction and poetry.

5. School K. Set up a cooperative project in which members of various majority and minority groups work together to make a success of something. The scheme followed might be similar to that described in Stewart Brown's "They See for Themselves". Study origins and characteristics of groups represented in your groups and have children write and produce a play for presentation in assembly or elsewhere. Work for direct participation in a cooperative effort.


It will be noted that methods used by Schools G and H were intellectual in approach; those by Schools J and F utilized vicarious experience; School K used the method of direct experience.

FIG. 4

SCHOOL C, GROUP X
BEGINNING OF EXPERIMENT

TABLE VII

CHANGES IN ACCEPTANCE INDICES FOR EXPERIMENTAL AND CONTROL GROUPS BY SCHOOLS AND SEX

School   X Groups: Boys, Girls, Total   C Groups: Boys, Girls, Total
(table values not legible in the source)

TABLE VIII

CHANGES IN REJECTION INDICES FOR EXPERIMENTAL AND CONTROL GROUPS BY SCHOOLS AND SEX

School   X Groups: Boys, Girls, Total   C Groups: Boys, Girls, Total
(table values not legible in the source)

There was an exhibit of a large number of books, magazines, and pamphlets, which were loaned to schools for the duration of the project. An annotated bibliography³ was also distributed.

Evaluation instruments were of two kinds: (1) sociometric questions concealed in an interest inventory; (2) the Social Sensitivity Ballots 4, 5, and 11.⁴ These latter tests proved too obvious for this age level and the results were discarded.


Analysis 


analysis was limited to the attitudes toward the 
Jewish American, th 
immigrant. 


groups as a whole 
C groups, School J showed 
X group. This is the School 


rease in rejection. 
decrease in rejection was sho 
School F; there seems to be
here. 


Tables IX through XII pr 
alysis of the pupil’s choice 
experimental and contro] g 


esent directional an- 
S and rejections for 


and end of the experiment, 
made by separate schools s 
fered from the others in cu 
table was made for School 
the pupils were of old-Am 
were Protestant. The indices for these tables were obtained by dividing the actual number of choices or rejections by the possible number. Changes were computed by subtracting indices at the beginning from those at the end; plus values, therefore, indicate increases in acceptance or rejection, minus values decreases in them. Abbreviations for groups were the same as those used earlier in the study; for example a PC acceptance means that a Protestant chose a Catholic as a friend.

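
The index arithmetic just described can be sketched as follows. This is only an illustration: the function names and the counts in the example are hypothetical, and the rule for counting the "possible number" of choices is an assumption, since the article does not spell it out.

```python
def choice_index(actual: int, possible: int) -> float:
    """Index = choices (or rejections) actually made / number possible."""
    if possible <= 0:
        raise ValueError("possible must be a positive count")
    return actual / possible


def index_change(begin_actual: int, begin_possible: int,
                 end_actual: int, end_possible: int) -> float:
    """Index at the end minus index at the beginning.

    A plus value is an increase in acceptance (or rejection),
    a minus value a decrease.
    """
    return (choice_index(end_actual, end_possible)
            - choice_index(begin_actual, begin_possible))


# Hypothetical PC example (Protestants choosing Catholics): assume 180
# possible choices at both testings, with 9 made at the beginning and
# 18 at the end. These counts are invented for illustration only.
pc_change = round(index_change(9, 180, 18, 180), 3)
print(f"PC change in acceptance index: {pc_change:+.3f}")
```

Because each index is a proportion of the possible choices, changes can be compared across groups of different sizes, which appears to be the reason the tables report indices rather than raw counts.
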
Table IX is a directional analysis of pupil acceptances and rejections in School G for experimental and control groups by sex, religious and cultural groups at the beginning and end of the experiment. The X group here contained 18 boys and 15 girls. Of these 28 had non-Jewish, American-born parents, and 5 were Jewish Americans. There were 18 Protestants, 10 Catholics, and five Jews. The control group contained 8 boys and 19 girls; of these 21 were non-Jewish old-stock Americans, 3 were new-stock Americans, and 3 were Jewish Americans; 17 were Protestant, 7 were Catholic, and 3 were Jewish. The greatest increase in acceptance in the X group is the Catholic choice of Jewish friends. Increases in order were: CJ, CC, JJ, BG, JC, [illegible], BB, AJ, JA, JP, AA, CP, PP; losses were [illegible], PC, GG. Decreases in rejection, in order, were: AJ, CP, PJ, and JA; AA, PC, CJ, BG, GG, [illegible], GB increased in rejection. Increases in acceptance in the C group, in order, were: JC, AJ, [illegible], FA, JA, JP, and GB; no changes were: CJ, [illegible], BG; the losses were: BB, FF, AF, JF, FJ, [illegible], PC, CP, PP, PJ, AA, and CC. Decreases in rejection in the C group, in order, were: PC, [illegible], BG, AA, and CP; AJ and GB increased in rejection; others showed no change.
Table X is a directional analysis of pupil acceptances and rejections in School H for experimental and control groups by sex, religious and cultural groups at the beginning and end of the experiment.


, 


Schools »" Wil 


“Son Library Bulletin (way 1919)’ 


TABLE IX

CHANGES IN DIRECTION OF PUPIL ACCEPTANCE AND REJECTION FOR EXPERIMENTAL AND CONTROL GROUPS IN SCHOOL G BY SEX, RELIGIOUS AND CULTURAL GROUPS

Groups   X Group: Acc., Rej.   C Group: Acc., Rej.
(table values not legible in the source)

TABLE X

CHANGES IN DIRECTION OF PUPIL ACCEPTANCE AND REJECTION FOR EXPERIMENTAL AND CONTROL GROUPS IN SCHOOL H BY SEX, RELIGIOUS AND CULTURAL GROUPS

Groups   X Group: Acc., Rej.   C Group: Acc., Rej.
(table values not legible in the source)

TABLE XI

CHANGES IN DIRECTION OF PUPIL ACCEPTANCE AND REJECTION FOR EXPERIMENTAL AND CONTROL GROUPS IN SCHOOL J BY SEX, RELIGIOUS AND CULTURAL GROUPS

Groups   X Group: Acc., Rej.   C Group: Acc., Rej.
(table values not legible in the source)

TABLE XII

CHANGES IN DIRECTION OF PUPIL ACCEPTANCE AND REJECTION FOR EXPERIMENTAL AND CONTROL GROUPS IN SCHOOL K BY SEX, RELIGIOUS AND CULTURAL GROUPS

Groups   X Group: Acc., Rej.   C Group: Acc., Rej.
(table values not legible in the source)


AA 
n een ὃ and BB increased in rejection; others 
να ο... In group C, increases in 
GER 2 in order were: GB, BG, PP, AA, AF, 
GG, = Lorca in acceptance were: PC. CG 
ας ώς BB. There were no decreases in 
Pe in Group C; G, PC, AA, and PP in- 
oe in rejection. 
Table XI is a similar directional analysis for School J. This school is particularly interesting because it was more heterogeneous from a cultural standpoint than any of the other schools. Also, most of the pupils were of limited intelligence. The economic level is quite low. The X group was made up as follows: boys 13, girls 11; children of non-Jewish American parentage 12, new-stock Americans 2, Jewish Americans 1, Negro Americans 9; Protestant 14, Catholic 8. Membership in the C group was as follows: boys 14, girls 10; children of non-Jewish American parentage 17, new-stock Americans 3, Negro Americans 4; Protestant [illegible], Catholic 4. The considerable number of Negro children were well accepted in School J.
oo in acceptance in Group x in order 
GG M J, NJ, NA, AJ, JC, NN, NF, BG, PJ, 
oo ee pc, AA, ΑΝ, GB, JA, ΑΣ, 
creases changes were: JP, JN, 
πα Y FA, FN, and FF. 
FA cow. in rejection in grou : 
"ee PC, BB and PP increased in rejection. 
GG BR in acceptance in Group : 
, , AA, NF, AF, CC, CP, PC, F 
G: rs and FA; no changes Were: , NN, : 
ion decreases, PP. The only decrease 1n rejec- 
os Group C was PC; groups increasing in re- 
and Lo ΝΑ, ΡΕ, GG, B, FA, 

T XII is a directio 
μον X group in this schoo 
can S: boys 13, girls 11; non-Jew 
ish parentage 19, new- | 

Americans 3; Protestant 7, Cat 


arentage 23, new- 
15 Catholic 19. 
order 


BB, PC, 


? 


nal analysis of School 
] was made up as 
ish Ameri- 
Jew- 


i ee in acceptanc 
Na o CC, GB, AF, GB, CP, 
ipi e AA; decreases Were: 
or th were unchanged. Decrea € 
ος X group were: JC, JA, CP, and BB; in 
C es in rejection were: CJ, GG; 
ους... GB; others remained unchanged. In- 
AF es in acceptance in Group C were: PE; 
AF, CP, FF, PC, AA, BB, GP, ος ag; Bape 
Der Sed in acceptance; FA remained unchange®- 
and oe in rejection in Group C were: PP; BG, 
Μον C; increases were CP and AA; others re- 
ained unchanged. 
A study of these tables reveals that in every school a larger number of people in the X group increased their indices of acceptance than in the corresponding C group. Where decreasing indices of acceptance was concerned, the groups showed little difference. Also no group differences were apparent in decreasing indices of rejection, or increasing them. However, indices of rejection showed a tendency to rise; there may be a factor of outgoingness here which shows itself in both acceptance and rejection.


Findings from the Second Year Project


Tentative conclusions may be drawn from the second year project:

1. The instructional approach by vicarious experiences through the study of literature seems more effective than other methods in increasing the amount of acceptance within groups.

2. The experimental teaching seemed, on the whole, to have two effects: (1) it increased the degree of acceptance in the group as a whole; (2) it increased the amount of acceptance cutting across majority-minority group lines.


SUMMARY AND IMPLICATIONS 


It seems clear that intergroup attitudes can be improved [illegible] of teaching.

The most promising technique among those tested for changing intergroup attitudes in favorable directions seems to be that of vicarious experience; its superiority over the method of direct experience is probably due to the fact that it is easily manageable and direct experience is usually not; also it is very difficult to make direct experience realistic.

Majority-minority group relations seem to be a part of a larger, overall factor of outgoingness; raising the level of acceptance in one area [illegible] attitudes in other areas.

The cultural background of the group is a factor in the reaction of the group to experimental techniques for changing intergroup attitudes. New-stock Americans and Jewish Americans made larger gains than old-stock Americans.

This study suffered from some handicaps, notably the limitations of present evaluation instruments and the difficulties met in securing favorable real experiences of a significant kind involving members of different cultural groups. Children are influenced by varied factors in their attitudes toward members of other groups and react selectively to educational stimuli according to their sex, cultural backgrounds, age and intelligence. Progress lies in developing more valid instruments of evaluation and using them to test varied techniques, among more heterogeneous groups, and in more realistic situations. It has been pointed out that the handicaps under which one works are especially great in this field. There is a corresponding need for caution in interpreting results. The best that can be hoped for is that some useful clues have been found for the guidance of teachers and that the participants in the project have developed more social sensitivity.


THE RELATIONSHIP BETWEEN THE SOCIAL 
STRUCTURE OF THE CLASSROOM AND 
THE ACADEMIC SUCCESS OF 
THE PUPILS 


MARGARET M. BUSWELL* 
Iowa State Teachers College 
Cedar Falls, Iowa 


Statement of the Problem 


The purpose of the present investigation was to determine whether or not those children who are accepted by their peers differ in certain achievements from those who are rejected.

A [illegible] study of almost any group will show that some members are consistently rejected while others are given most of the attention, positions of leadership, and other responsibilities. Being on the outskirts of a group is not only harmful to the individual, but limits the potential help that this individual might give to the group if he were an accepted member of it. Present theories in regard to group dynamics indicate that it is not necessary for groups to be so constituted. Under proper leadership, all members of the group should become participating members. If the group is so constituted as to include some members who largely dominate the scene, and others who seldom enter in except as on-lookers, it is important to know which people are in these capacities, and what they are like in other respects. Only after the leader has identified those who are playing the various roles in the group, can he work toward effecting a balance.

According to modern philosophies of education, the teacher does more than merely dispense subject matter to pupils. One of the things she presumably does is to help children learn to live [illegible] and play cooperatively. She attempts to help each individual develop his capacities and take advantage of his particular talents. She [illegible] in each child something, however [illegible], which he can contribute to the group. She must also help him find the opportunity to make this contribution. Some children find something to add to the others without too much specific help from the teacher. But it is very likely that every teacher will have some isolated individuals in her class. It is important for her to know why these children are isolated and why they are not able to contribute to the group, in order that she may help improve their social relationships in the classroom. To do this she also needs to know the basis of the smaller classroom groups.

Most studies of the social acceptability of elementary school children seem to have been sociometric in nature. Research by Moreno has shown that more accurate information about the acceptability of children by their peers may be obtained from the children themselves than from the judgment of their teachers. In checking the accuracy of teachers' judgments regarding the most and least popular members of the various classroom groups, Moreno found a constantly decreasing figure, from that of being 62.5% accurate in the kindergarten down to being only 25% accurate in the seventh grade, with an increase to 40% in the eighth grade (11). If teachers, knowing pupils as well as they do, are unable to learn from observation how the group is structured socially, it is evident that some more accurate measure is necessary.

Several studies have reported results which indicate the degree of stability of sociometric measures. There is a fairly wide variation in the results obtained. At the one extreme are the results of Stacker who found at the fifth grade level a correlation of .87 between a sociometric retest and one given five months earlier (13). At the other extreme, Criswell finds a fluctuation of from 20% to 50% in the choice of one's best friends over a six-week period in grades 1 to 8 (3). However, even with this great degree of fluctuation, no significant changes in the general structure of the classroom were found.¹

* The writer wishes to express sincere appreciation to Guy L. Bond, University of Minnesota, for guidance and advice.

1. For a discussion of the contribution of several investigators toward sociometric measurement in general, including reliability, validity, and methods of interpreting the results, see Ph.D. thesis of the writer, University of Minnesota Library.

The majority of these studies have been in the general area of personality factors with achievement and intelligence playing minor roles. As it is hardly feasible to tackle all phases of the problem at once, in the present study an attempt was made to focus attention on the aspect of achievement. A few other investigators have also taken this approach. McLendon found a direct relationship between achievement as measured by the Stanford Achievement Test given in the fifth and sixth grades, and social acceptability when she compared the extreme cases (9). In the intermediate grades she gave the California Test of Mental Maturity and again merely through observing the extreme cases, she found intelligence to be related to acceptability. Mitchell found a relationship between enjoyment of reading and social acceptability in a sixth grade class in a New York community (10). Nora Loeb seems even more convinced of the relationship between the acceptability of the pupil and his achievement (7). She implies that the achievement is a cause of the acceptability, for she suggests the hypothesis that
‘The skills that the culture itself regards as im- 
individual to participate 
in it effectively, are also regarded by the child- 


ere more nearly alike 
So far as academ- 
d, in all grades the 

be more alike than the 
nces were not signif- 

In another journal Bonney reports further 
results of his study wherein he correlated social 

acceptability with reading ability and with I. Q., 


ound coefficients vary- 


5 ; and fourth grades, 
No tests of significance were reported. 


Young sees the other side of the picture after giving the Ohio Social Acceptance Scale and many other measures to a class of seventh graders. He says, "As might be expected, the sociometric array of seven criteria correlated negatively with intelligence quotients and only slightly with achievement test grade placements. Expressed in another way, this study bore out the findings frequently reported by students of the social psychology of children that social status is not associated to any extent with a child's intelligence or his school work." (14) He does not name the students who have made this discovery, but it would indicate a possible bias in the findings of both educators and sociologists when they come to such opposing conclusions.

In the hope of throwing light on controversies 
such as this, the present investigation was under- 
taken. With the general aim of looking at the so- 
cial acceptability of children with a focus on their 
achievement, the social structure of two different 
age groups of school children was studied by 
means of the sociometric technique in order to 
test the following null hypothesis: 


There is no relationship between the so- 
cial structure of a classroom group and 
the achievement in some of the basic el- 
ementary school subjects of the members 
of the class. 


The general problem was to search for a relationship between achievement and social acceptability. A secondary problem, upon the solution of which the primary problem was dependent, was to find a good measure of social acceptability. Present scales and techniques are still not very good, and are in need of much refinement. Therefore, there was an attempt to check the validity of these measures and to find one that could be considered an acceptable measure for the present study.


During the study of this problem there was an attempt to answer several questions. First, do the children of the most accepted group and those of the least accepted group differ in the achievements to be measured? If so, can this difference be accounted for by the intelligence or by the socio-economic status of the group, or is the difference significant even when these factors are held constant? Second, if there is a relationship between social acceptability and achievement, is it stronger at the end of the school year when academic success is more prominent in every-day life than after a summer vacation when children tend to have developed different bases for choosing their friends? Third, do children tend to join together in clusters on the basis of any of the factors tested? That is, do children prefer friends who are more like themselves in social acceptability, in academic ability, in intelligence, or in socio-economic status? And fourth, will the relationship, if any exists, be stronger in the sixth grade than in the kindergarten?


discover some fa 


the degree to whi 


Work and the degree to which he is accepted by 


his peers—which is, indeed, also a measure of 
achievement. 




Design of the Investigation 


The present study involves an analysis of social acceptability of children at two different grade levels. One group of children was studied in kindergarten in the spring of 1949, and again in the following fall when they were in the first grade. This group will be referred to as the kindergarten-first grade group, or as the lower grades. The second group was studied in the fifth grade in the spring of 1949 and again in the following fall in the sixth grade. These pupils will be referred to as the fifth-sixth grade group, or the upper grades. By the use of analysis of variance and covariance, the socially most acceptable members of each group and the socially least acceptable were compared on several factors in order to determine what may be associated with social acceptability.

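
As a rough illustration of the kind of comparison named here, the sketch below computes a one-way analysis of variance F ratio for a most-accepted and a least-accepted group. It is a minimal sketch under stated assumptions: the function name and the scores are hypothetical, the covariance adjustment for intelligence and socio-economic status is not shown, and nothing here reproduces the study's actual computations.

```python
from statistics import mean


def one_way_anova_f(*groups):
    """Return (F, df_between, df_within) for a one-way analysis of variance."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    if k < 2 or n <= k:
        raise ValueError("need at least two groups and more scores than groups")
    grand_mean = mean(x for g in groups for x in g)
    # Between-groups sum of squares: group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-groups sum of squares: scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    if ss_within == 0:
        raise ValueError("no within-group variation; F is undefined")
    df_between = k - 1
    df_within = n - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical achievement scores, invented for illustration only.
most_accepted = [4, 5, 6]
least_accepted = [1, 2, 3]
f_ratio, df_b, df_w = one_way_anova_f(most_accepted, least_accepted)
print(f"F({df_b}, {df_w}) = {f_ratio:.2f}")
```

The F ratio would then be referred to an F table with the stated degrees of freedom; a large ratio means the two groups differ by more than the spread of scores within each group would suggest.
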


Selection and Description of the Sample


It was decided to use the city of St. Paul, Minnesota, as the general population from which to draw the sample. The results of a pilot study indicated that between 250 and 300 pupils would give a stable population upon which to [illegible]. Therefore, since it was inevitable that a certain number of cases would be lost before the follow-up in the fall, enough fifth grade classrooms in St. Paul were selected to give a population of somewhat over 300. The schools to be used were chosen at random by the use of the Kendall and Babington Smith Tables of Random Numbers (6, p. 194). A total of eleven fifth grade classrooms in eight schools were used. This made up the total fifth grade population in each of the schools. The kindergarten population was composed of all the kindergarteners in the eight schools selected.

The number of cases in the fifth grade in the spring was 358. Thirty-seven of these children did not return to school in the fall. Therefore the fifth-sixth grade population studied consisted of 321 children, 145 boys and 176 girls. Since it was found that the mean score in intelligence and in each of the achievements measured did not differ for the 37 cases from those for the 321, it was assumed that the loss of these cases did not bias the sample.


In the kindergartens in the spring there were many more cases than in the fifth grades. The kindergartens in two of the schools had to be dropped from the study because of incomplete data, leaving in the fall a total of 286 children, 139 boys

BUSWELL 
39 


and 147 girls, of the younger age group. A comparison of the mean reading readiness scores (used at this level as the estimate of future achievement) for the schools dropped and the schools retained again showed no difference and therefore presumably an unbiased sample still remained.


Instruments Used 


Upper Grades 


Intelligence. —As a measure of intelligence 
in the upper grade group, the Revised Stanford 
Binet Test was used. The superiority of this test 
over group tests as a method of estimating intel- 
ligence is undisputed. The validity of the instru- 
ment in this particular case depends upon the ac- 
curacy of the student examiners who administer- 
ed the tests. On another occasion the results ob- 
tained by a similar class of student testers were 
compared with those of experienced examiners 
and found to agree closely (8). This indicates 
that it is reasonable to assume the I.Q.'s obtained in this study to be adequate estimates of the true intelligence quotients of the pupils.

Achievement. — The Iowa Every Pupil Tests of Basic Skills² were selected as the means of measuring the achievement of the pupils, since these were considered by the investigator to be the best tests of general understandings and skills. The norms were established in the Middle West and should therefore be applicable to these pupils. In addition, reliability and validity are considered to be adequate. The sections of the battery which were used in the present investigation were those testing reading, arithmetic, and basic study skills.

Socio-Economic Status. — The Sims Score Card for Determining Socio-Economic Status³ was used to give a general indication of the type of home background from which the pupils came. Sims indicates that all scores derived from the inventory are relative and are best compared with other scores in the same community rather than with [illegible] other communities. So long as St. Paul [illegible] are being compared with other [illegible], the results are adequate for [illegible]

Acceptability. — The most questionable instruments used in the study were those for determining the acceptance of a child by the other members of the group. These measures, besides being the least adequate, were the most vital to the success of the study as a whole, for it was on the


2. Prepared by E. F. Lindquist and others, State University of Iowa. Distributed by Houghton Mifflin Co., Boston, 1919.

3. Prepared by V. M. Sims, University of Alabama. Distributed by the Public School Publishing Company, Bloomington, Illinois, 1927.


JOURNAL OF EXPERIMENTAL EDUCATION 
40 


basis of degree of social acceptability that the various comparisons were made. As the field of social measurement is a comparatively new one, the measures need considerable refinement, but the measures chosen for this study seem to be as adequate as any for the use to which they were put.

The Ohio Social Acceptance Scale⁴, which was used in this study, is a device whereby each individual in the class rates each other individual on a six-point scale ranging from a rating of "1", entitled "My very, very best friends", to a rating of "6", entitled "Dislike them." Louis Raths, in an article on the validity of this scale, states [illegible] has not been disproved (12). [illegible] consistency due to the fact [illegible] friends," 10 to "My other [illegible] to zero for "Dislike them," [illegible] were attempted. They [illegible] below, pages 41, 43.

Two other measures of degree of acceptability were obtained, and although they were not used in the final analysis, they were used as a rough means of validating [illegible] of these measures [illegible] by the investigator. The first [illegible] of friendship, and asked that each [illegible] down, in order of preference, the names of the three people in the classroom with [illegible]


(Vol. 22 


that it would take many specific questions to give a valid over-all picture.

On both the questionnaires and on the Ohio Social Acceptance Scale, the identity of the rater was concealed so far as the pupils themselves were concerned. That is, in order to obtain truthful appraisals of their classmates, they were not asked to sign their names. However, since for research purposes it was necessary to know which paper belonged to each pupil, the papers were coded and distributed in order, so that names could be put on each one later.

These measures, then, of intelligence, achievement, socio-economic status, and acceptability supplied the data around which the analyses at the upper grade level were designed.


Lower Grades 


Intelligence. — Since the Detroit Beginning First-Grade Intelligence Tests⁵ are given by every first grade teacher in St. Paul each [illegible], it was decided to use the results of these as an estimate of the intelligence of the younger group of children in the study. This is a paper-and-pencil [illegible] The test is non-verbal so far as reading is concerned, but the children must follow the verbal directions of the teacher. Although adequate norms seem to be given, based on a large [illegible] in the present study, [illegible] individuals. The [illegible] used as the [illegible] means for predicting [illegible] readiness tests on [illegible]


Acceptability. — Both at the time when the children were in kindergarten and when they were in the first grade, their acceptability was determined by individual interview. They were asked, "Of all the boys and girls [illegible]; the request was made, "Now tell me the names of two other children in your room
4. Prepared under the direction of the Ohio Scholarship Tests and the Division of Elementary Supervision, State Department of Education, Columbus, Ohio, 1946.

5. [illegible] Distributed by World Book Co., Yonkers-on-Hudson [illegible]

6. Constructed by Arthur I. Gates. Distributed by Bureau of Publications, Teachers College, Columbia University, New York, revised 1942.


you [illegible] much." In the final analysis all choices were given equal weighting.


Dates and Administration of Tests 


The spring testing at both the kindergarten and fifth grade levels was done in May and early June, 1949. The follow-up testing was done during the first [illegible] of school in the fall of 1949. In the fall, the sociometric tests were the last ones to be given, in order that, particularly at the lower grade level, the children have an opportunity to know each other better. Figure 1 indicates the general schedule for collecting the data. All testing, except in the case of the Detroit tests which were given by the first grade teachers, was carried out by the investigator assisted by University students trained in testing procedures.


Selection of Groups for Analysis 


The most important step in the analysis of the data was determining whether or not a child should be placed in the Accepted or in the Rejected Group. As more measures were collected on the upper age group in this respect, this level will be considered first.

Upper Grades. — Here there were several possible means of analysis. One way would involve looking at the first, second, and third choices made in response to the question, "Who in this room do you like best to do things with in and out of school?" It would be possible to look at the first choices only, or to look at the total of all three choices, or to make a set of weightings for each of the three choices. It was found that totaling all three choices gave approximately the same results as did using the first choices alone. However, using only the first choices produced more ties, making decisions difficult regarding which pupils to include in the Accepted and Rejected Groups.
The leadership data were used in correlation [illegible] to investigate the relationship between this factor and friendship. Had the correlation approached unity, it would have appeared that the children were recording as "friends" really those people whom they admired as leaders. However, as the correlation coefficient was .639, it is evident that a distinction was made. Thus not leadership, but friendship, is what is being measured. With this question answered, the leadership data were no longer used for the analysis.
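The check described above rests on a correlation coefficient between the leadership data and the friendship data. The following is only a minimal sketch of a product-moment correlation in Python; the study does not say which form of coefficient was computed, the function name `pearson_r` is illustrative, and the vote counts are invented for the example rather than taken from the study.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products about the two means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical data: times each of six pupils was chosen as friend and as leader.
friend_votes = [5, 3, 0, 2, 7, 1]
leader_votes = [4, 1, 1, 0, 6, 3]
r = pearson_r(friend_votes, leader_votes)
```

A value near 1 would mean the two questions rank the pupils almost identically; a clearly lower value, such as the .639 reported, leaves room for the two choices to differ.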


In addition to the several possible means of analyzing the data on the sociometric questionnaire, [illegible] the Ohio Social Acceptance Scale should [illegible] be analyzed. As was stated above, there seems to be no justification for using arbitrary weightings. Another question arising out of the use of all six degrees on this scale is that of the human variable. Can the fifth and sixth graders accurately distinguish between the others in the room in six varying degrees? It was decided that a more valid measure would be obtained by using only the extremes of the scale, considering as "Accepted" those most often chosen as "My very, very best friend", and as "Rejected" those with the most votes in the "Dislike them" column. Rather than simply taking the total number in each of these two categories, two adjustments were made. The first was in the case of children who gave a rating of "6" (Dislike them) to no one. In the minds of many individuals, as well as in some classes in general, it is not considered "nice" to say that one does not like a person. For the people who feel this way, the most extreme form of rejection was expressed by the "5" category, stated as "Don't care for them". Therefore, if no sixes were given, the fives were counted. However, when neither fives nor sixes were given it was not deemed just to count the fours or the threes, as it is possible that a person might dislike no one in his class. In general, it appeared that those not using the "5" and "6" categories were the best-liked people, and it would seem reasonable to believe that they did not really dislike anyone. A similar adjustment was not made at the positive end of the scale, as raters here have not been found to avoid this extreme to the extent that they avoid the negative.
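The first adjustment amounts to a small decision rule applied to each rater's paper. A sketch of that rule in Python follows; the function name and the dictionary layout (classmate name mapped to a rating from 1 to 6) are assumptions made for illustration, not the form in which the original data were kept.

```python
def rejection_ratings(ratings):
    """Return the classmates one rater is counted as rejecting.

    `ratings` maps a classmate's name to that rater's rating (1 to 6).
    Sixes count as rejections. If the rater gave no sixes, the fives
    are counted instead. If neither was given, no one is counted.
    """
    sixes = [name for name, rating in ratings.items() if rating == 6]
    if sixes:
        return sixes
    return [name for name, rating in ratings.items() if rating == 5]


# Hypothetical rater who never uses "6": the single "5" is counted.
print(rejection_ratings({"Ann": 1, "Bob": 5, "Cal": 3}))
```

The positive end needs no such rule, since only ratings of "1" are counted there.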

A second adjustment was made since no restriction was put on the number of names that might be placed in each of the six categories and some pupils were much more generous with their ratings of "1" or "6" than were others. There were boys, for instance, who automatically gave a "6" to every girl in the class. Getting a "6" from such a boy would not imply the same degree of rejection as would receiving a "6" from someone who gave only one "6" to the entire group. Therefore, for each person, there was calculated the value that his "1" and his "6" had when given to a boy and when given to a girl. This was done by assuming that the ratings should be distributed normally and calculating the percent of each rating that should be expected. By the use of the normal table, these percents were then converted into weightings, so that when few of either value were given, the rating would be weighted heavily, while if many were given, they would add little to the total score. This method of normalizing is described by Holzinger (4, pp. 221-224). This process of obtaining separate weightings for the ratings given to boys and to girls made it possible to combine the sexes for further analysis. When these weights had been calculated for choices as given by each individual, the total score received by each was computed. The Accepted Group was composed of those in the upper 27% as based on the ratings of "1", and




Figure 1. Time of Test Administration

Kindergarten-Grade 1: Acceptability (Interview) in both testing periods; Intelligence (Detroit); Reading Readiness (Gates).
Grades 5-6: Acceptability (Ohio and Questionnaires) in both testing periods; Achievement (Iowa); Socio-Economic Status (Sims); Intelligence (Binet).

Note: In three of the eleven classes, the Binet Tests were given in the following fall. As I.Q. rather than M.A. was used in the calculations, this should make no difference.


the Rejected Group, of those in the upper 27% as based on the ratings of "6". This introduced one interesting factor not encountered when the two groups to be compared are taken from the opposite extremes of the same continuum. Here it was possible for one individual to be in the most accepted group and also to be in the most rejected. After a moment's thought it will seem that this is understandable. We have all known individuals with extreme personalities, who are idolized or despised, and about whom no one feels neutral. Several of these individuals, five in number in the spring and six in the fall, were included in both groups. Altogether, there were 86 pupils in the Accepted Group and 86 pupils in the Rejected Group both spring and fall.

After the groups had been selected on this basis, the investigator then returned to the sociometric questionnaires to see who would have been in the groups if only total number of times chosen had been the criterion. There was a surprising degree of similarity in the two lists. "Why then," one might ask, "should such a lengthy weighting process be used, when counting total number of first, second, and third choices would give substantially the same results?" The chief advantage is in having a definite basis for breaking ties. In choosing the 27% in a class, say, of [illegible], where nine would be needed for each group, there might be seven who had definitely more choices than any of the others, but five more might be tied for the last two places on the list. In many cases these first seven would be the same as determined by either method, but the weighting system is more discriminating and therefore ties do not occur. This makes the choosing of the groups less arbitrary.
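The weighting of extreme ratings and the selection of the upper 27% can be sketched as follows. This is not a reproduction of Holzinger's procedure, which is only cited in the text. It assumes one common way of turning a percentage into a normal-curve weight: the mean standard score of the top share of a normal distribution. The function names, and the choice to round the group size to the nearest whole pupil, are likewise assumptions.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def extreme_rating_weight(n_given, n_rated):
    """Weight of one extreme rating from a rater who gave n_given of them
    to the n_rated classmates of one sex.

    Assumption: the weight is the mean standard score of the top
    n_given / n_rated share of a normal distribution, so ratings that
    are handed out sparingly weigh more.
    """
    if not 0 < n_given <= n_rated:
        raise ValueError("n_given must be between 1 and n_rated")
    share = n_given / n_rated
    if share == 1.0:
        return 0.0  # a rating given to everyone carries no information
    cut = STANDARD_NORMAL.inv_cdf(1.0 - share)
    return STANDARD_NORMAL.pdf(cut) / share


def top_fraction(total_scores, fraction=0.27):
    """Names of the highest scorers making up about `fraction` of the group."""
    group_size = round(len(total_scores) * fraction)
    ranked = sorted(total_scores, key=total_scores.get, reverse=True)
    return ranked[:group_size]


# A "6" from a boy who gave one "6" to twenty girls weighs far more than
# a "6" from a boy who gave it to nineteen of them.
rare = extreme_rating_weight(1, 20)
common = extreme_rating_weight(19, 20)
```

Because the weights are continuous rather than whole counts, two pupils seldom end with exactly the same total, which is the tie-breaking advantage the text describes.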
Lower Grades. — Checking the acceptability data by two methods, Ohio Social Acceptance Scale and Sociometric Questionnaire, has two advantages. As stated above, the close agreement between the two measures lends an indication of validity to the data. In addition, at the kindergarten-first grade level it was not possible to get data by means of the Ohio Scale, and it is good to know that using the data obtained by totaling first, second, and third choices gives substantially the same results. Even when it is decided that the total number of choices received can justifiably be used as indicating acceptability at this age level, the problem of breaking ties is still encountered. Therefore a definite criterion was set up upon which to base the inclusion or exclusion of [illegible]. After the groups had thus been selected, there were 76 children in the Accepted Group in the spring and 80 in the Accepted Group in the fall. At this younger age level it did not seem advisable to ask the children for negative information about their classmates, so instead of a Rejected Group at this level, the group will be called Ignored. There were 81 children in this group in the spring, and 77 in the fall.

With the groups chosen (an Accepted and a Re- 
jected Group for the fifth grade in the spring and 
for the sixth grade in the fall, and an Accepted 
and Ignored Group for the kindergarten in the 
spring and for the first grade in the fall) it was 
then possible to continue the analysis. 


Summary of the Design of the Investigation 


The present study of the relationship between 
social acceptability and achievement began with 
the selection through a process of randomization 
of a group of 321 fifth graders and 286 kindergart- 
eners who were to be studied. To the older group 
was given a battery of achievement tests, an in- 
telligence test, a questionnaire to determine so- 
cio-economic status, and three measures of social 
acceptability. The younger children were given 
readiness tests and intelligence tests, and were 
interviewed individually regarding their choice of friends.

From the sociometric data, two groups of children were selected at each level, one group that was well-liked by the others and one group that was not. The children in these extreme groups of social acceptability were studied both before and after a summer vacation. After the assumptions basic to the statistical analysis had been satisfied, these two groups of children at each level were compared by means of the technique of analysis of variance and covariance to see whether or not [illegible].
Testing of Basic Assumptions 


Before the actual statistical analysis was un- 
dertaken, certain assumptions were tested. Tests 
of homogeneity and normality were made for the 
various factors. These assumptions are discussed 
in the order in which they were encountered and 
satisfied during the analysis. The reader not in- 
terested in the testing of these assumptions may 
omit this section. He may proceed directly to the section entitled "Analysis of the Results."


Pooling the Schools 


The data for this investigation were obtained from eight schools in St. Paul. In order that the results from all these schools might be considered as one group rather than as eight separate groups, it was shown that the populations of these schools were homogeneous. That is, before the data were pooled it was shown that the schools were not significantly different from each other in either variability or mean in each of the three outcome variables, reading, arithmetic, and work-study skills. The L-Test, as devised by Welch (5, p. 93), was used in determining the significance of the differences among the variances of the measures; and the F-Test (5, p. 214) was used in determining the significance of the differences between the means of the eight schools for

[illegible] variance among the [illegible] Homogeneity of [illegible] in all outcome variables. [illegible] In socio-economic status and in intelligence at both grade levels the schools were not found to be homogeneous in mean score. Since in a [illegible] decided that, if the total data [illegible] normally distributed, it would be [illegible] schools were from the same [illegible]

Since it had been established that the eight 
Schools were sufficiently similar that they might 
be considered to belong to one homogeneous pop- 
ulation, all data were pooled for further analysis, 


[illegible] of the three achievements: arithmetic, [illegible] and study skills, [illegible] be seen in Table I. As seen in the table, the hypothesis in each case [illegible] each achievement, [illegible]

During the process of the analysis of covariance it was necessary to test another assumption. The analysis of covariance is a procedure by which factors which are presumed to have an effect on the outcome variable may be statistically held constant during the analysis in order to determine the relative effects of the variables which seem to be involved. At the fifth and sixth grade levels the factors presumed by the investigator to have a possible effect on the social acceptability of the children (in addition to the achievement factors which were being studied), were the factors of intelligence and socio-economic status. Therefore, the technique of covariance was used to control these factors so that their effects might be studied independently. Hence, when the stage of the covariance analysis was reached in which these two factors were controlled, the assumption was made that, even with these factors held constant, the variabilities of the Accepted Group and the Rejected Group were not statistically different. Actually what was being tested in regard to this assumption was the homogeneity of regression of the dependent factors (the achievements) on intelligence and socio-economic status. Each of the two factors was controlled separately, then both were controlled simultaneously. The results of this analysis are found in Table II. In reading and study skills all hypotheses were accepted. In arithmetic at the sixth grade level the obtained F-value was between the .01 and .05 values and therefore the rejection of the hypothesis was doubtful. However, as the .01 level had been set as the critical level, the groups were considered homogeneous. Since the hypotheses with respect to [illegible] the assumptions basic to the analysis of covariance were satisfied, [illegible]
Lower Grades. [illegible] just described for [illegible] Since at this level [illegible] the measure of [illegible]

At this lower level there were two factors, in addition to the factor of readiness, which were considered to exert a possible influence on the social acceptability of the child. These were intelligence and chronological age. By the test of homogeneity of regression, as explained above for the upper grades, the final assumptions basic to the analysis of covariance for the lower grades were satisfied. Table IV gives a summary of the results of these tests. Again it was found that all of the hypotheses were accepted at the .01 level. Therefore it was possible to continue the analysis.


Analysis of the Results 


Comparison of the Accepted with the Rejected Group by the Process of Analysis of Variance and Covariance


In the upper Srades, for each of the three 
achievements (reading, arithmetic, and study 




TABLE I

SUMMARY OF TESTS OF HOMOGENEITY OF VARIANCE
IN VARIOUS FACTORS FOR THE ACCEPTED AND
REJECTED GROUPS IN THE UPPER GRADES*

Measurement         Fo      Hypothesis
Spring
  Reading           1.00    Accepted
  Arithmetic        1.23    Accepted
  Study Skills      1.04    Accepted
Fall
  Reading           1.00    Accepted
  Arithmetic        1.39    Accepted
  Study Skills      1.14    Accepted

F.01 = 2.58
*For complete tables, see unpublished Ph.D. thesis of the
writer, University of Minnesota Library.
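The Fo values in Table I are all 1.00 or larger, which is consistent with a variance ratio formed by dividing the larger sample variance by the smaller. The sketch below assumes that form of the test; the scores are invented, and `variance_ratio` is an illustrative name.

```python
from statistics import variance


def variance_ratio(group_a, group_b):
    """F ratio of the larger sample variance to the smaller.

    Returns (F, (df_numerator, df_denominator)).
    """
    var_a, var_b = variance(group_a), variance(group_b)
    if var_a >= var_b:
        return var_a / var_b, (len(group_a) - 1, len(group_b) - 1)
    return var_b / var_a, (len(group_b) - 1, len(group_a) - 1)


# Hypothetical reading scores for an accepted and a rejected group.
accepted = [52.0, 61.0, 58.0, 49.0, 66.0, 55.0]
rejected = [41.0, 63.0, 50.0, 47.0, 59.0, 38.0]
f_observed, degrees = variance_ratio(accepted, rejected)
```

The hypothesis of equal variances is retained when the observed ratio falls below the tabled F value at the chosen level for those degrees of freedom.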


TABLE II

SUMMARY OF TESTS OF HOMOGENEITY OF REGRESSION OF THE ACCEPTED
AND REJECTED GROUPS IN THE UPPER GRADES

                         Regression of        Regression of        Regression of
                         Reading              Arithmetic           Study Skills
Measurement              Fo     Hypothesis    Fo     Hypothesis    Fo           Hypothesis
Spring
  Intelligence           1.05   Accepted      1.26   Accepted      1.32         Accepted
  Socio-Economic Status  1.02   Accepted      1.24   Accepted      1.08         Accepted
  Both of the Above      1.06   Accepted      1.28   Accepted      1.38         Accepted
Fall
  Intelligence           1.02   Accepted      1.47   Doubtful      [illegible]  Accepted
  Socio-Economic Status  1.06   Accepted      1.61   Doubtful      [illegible]  Accepted
  Both of the Above      1.00   Accepted      1.52   Doubtful      1.35         Accepted




TABLE III

SUMMARY OF TESTS OF HOMOGENEITY OF
VARIANCE IN READING READINESS OF THE
ACCEPTED AND IGNORED GROUPS
IN THE LOWER GRADES

            Fo      Hypothesis
Spring      1.36    Accepted
Fall        1.61    Doubtful

TABLE IV

SUMMARY OF TESTS OF HOMOGENEITY OF REGRESSION
OF THE ACCEPTED AND IGNORED
GROUPS IN THE LOWER GRADES

                       Regression of Reading Readiness
Measurement            Fo      Hypothesis
Spring
  Intelligence         1.09    Accepted
  Chronological Age    1.37    Accepted
  Both of the Above    1.09    Accepted
Fall
  Intelligence         1.23    Accepted
  Chronological Age    1.60    Doubtful
  Both of the Above    1.23    Accepted


TABLE V

TEST OF THE DIFFERENCE BETWEEN THE MEAN READING SCORE OF THE
ACCEPTED GROUP AND THE REJECTED GROUP IN THE FIFTH GRADE

Source of                   Sum of         Mean
Variation         D.F.      Squares        Square        F         Hypothesis
Between Groups      1        3,709.698     3,709.698     14.147    Rejected
Within Groups     166       43,528.820       262.222
Total             167       47,238.518
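The entries of Table V are tied together by simple arithmetic: each mean square is a sum of squares divided by its degrees of freedom, and F is the ratio of the two mean squares. The short Python check below uses only the figures printed in the table; the function name `anova_f` is illustrative.

```python
def anova_f(ss_between, df_between, ss_within, df_within):
    """Mean squares and F ratio for a one-way analysis of variance."""
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between, ms_within, ms_between / ms_within


# Sums of squares and degrees of freedom as printed in Table V.
ms_between, ms_within, f_ratio = anova_f(3709.698, 1, 43528.820, 166)
print(round(ms_within, 3), round(f_ratio, 3))  # 262.222 14.147
```

The two sums of squares also add to the printed total, 47,238.518, with 1 + 166 = 167 degrees of freedom.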


skills), mean scores were calculated for the Accepted Group and for the Rejected Group. The significance of the differences between the means was then tested. These tests were followed by similar tests to determine whether or not, with the factors of intelligence and socio-economic status held constant, there would be a difference between the means. Two final tests of covariance were made, controlling each of these factors (I.Q. and S.E.S.) separately, in order to determine the effect of each on the mean achievements of the two groups. Since direct comparisons seem to be the [illegible], Table V indicates how the analysis has been carried out and Table VI summarizes the results of the tests. A study of Table VI indicates several facts which it would be well to list before attempting interpretation.

1. At both grade levels, with no factors controlled, the two extreme groups in acceptability were significantly different in mean achievement, always in favor of the Accepted Group. One reservation to this statement should be made in regard to the arithmetic achievement, where the difference was in the region of doubt. However, the F-value approached the 1% level. Were it to reach this level, it would be concluded that the Accepted Group was unlike the Rejected Group in arithmetic, as in the other achievements.

2. When intelligence and socio-economic status were controlled, these differences between the means in the various achievements disappeared.

3. When intelligence alone was controlled, there was no difference between the mean achievement of the two groups.

4. When socio-economic status alone was controlled, there was a difference, just as there was when no factors had been controlled, except in the case of the arithmetic score in the sixth grade, which had remained in the area of doubt in the first test. This arithmetic test is the one test on which the control of socio-economic status was found to have an influence.
the Acce T that there was a difference between 
factors E ed and the Rejected Groups in mo st 
derstang achievement is not as difficult to un- 
tic, the pas, the fact that, in the fall in arithme- 
tor of iiec ationship was so slight. Even the fac- 
to Dring heto fe gaps which was not enough 
hen califo ba Ὁ groups together at any other time, 
arithmetic olled, made the two groups alike in 
ecause κας in the sixth grade. Isthis 
Of the A hmetic, as probably taught in many 
only one ας studied, is a tool which is usedfor 
skills entes Dd a day, whereas reading and study 
Ghia πει, So:many ασ κε that those 
dent to the i Mss these abilities will be more evi- 
νοῶ ναών ability is 
Could it be sey to achievement in the spring, 
t during the summer vacation the 


BUSWELL am 


pupils found little apparent use for it, so those 
who did not excel had no handicap, while those 
children who liked to read found opportunity for 
this pastime and were observed by the others in 
this activity? The causes for this circumstance 
can only be hypothesized. This investigation does 
not attempt to produce answers in regard to caus- 
ality. However, it is interesting to speculate as 
to why certain relationships have come about. 

With the exception of this arithmetic factor, which seems out of line with the other results, it was rather definitely shown that achievement as
such is related to social acceptability, that it is 
the intellectual factor associated with this achieve- 
ment which is the basic component in the relation- 
ship, and that socio-economic status as such has 
little relationship to social acceptability. 

For the lower grade group the data have been summarized in a fashion similar to that of the data for the upper grades. The basic relationships are summarized in Table VII. Two general remarks should be made about the facts shown in this table:

1. In the first grade the acceptance and rejection of hypotheses follow the same pattern as in the upper grades. That is, with no factors controlled, the Accepted Group had a significantly higher mean score in reading readiness than the Ignored Group; with both mental ability and chronological age controlled, the differences were no longer seen to exist; and of these two factors, it was the intelligence which alone could bring about the acceptance of the hypothesis of no relationship.

2. The only time in the entire study when no relationship was found between those who were acceptable and those who were achieving was at the kindergarten level. It must be remembered that data about acceptability of the kindergarteners were secured from them in the spring, and the reading readiness test, by which their "achievement" was measured, was given in the following fall. Thus it was found that the children who were socially most accepted in kindergarten, and those who were socially least accepted, were not significantly different in the following fall in reading readiness as measured by the Gates Reading Readiness Test.

The interpretation of these two opposing conditions can only be a reiteration of the facts; the children who were socially most acceptable in kindergarten were not necessarily the ones who were the most successful in a reading readiness test the following fall; the children who were socially most acceptable in the first grade were, in general, the ones who were successful in a reading readiness test. Whether this change came first in acceptability and then in success




TABLE VI

SUMMARY OF TESTS OF HOMOGENEITY OF MEANS BETWEEN THE ACCEPTED AND THE REJECTED GROUPS ON THE VARIOUS FACTORS STUDIED IN THE UPPER GRADES

[Table printed sideways; the numerical entries are not legible in this copy. Columns: Measurement; Group; Mean; then D.F., Fo, and Hypothesis under each of four headings for Factors Controlled: None, Intelligence and Socio-Economic Status, Intelligence, Socio-Economic Status. Rows: Reading, Arithmetic, and Study Skills for Grade 5 and for Grade 6, each with an Accepted and a Rejected line.]




TABLE VIII

RELATIONSHIP BETWEEN A PUPIL'S ABILITY IN READING, ARITHMETIC,
AND STUDY SKILLS AND THESE ABILITIES OF HIS BEST FRIEND

                                  Chosen as Friends
                       Best         Average      Poorest
                       Achievers    Achievers    Achievers    Total
Best Achievers           137           92           70         299
Average Achievers        127           94           75         296
Poorest Achievers        104          117           80         301
Total                    368          303          225         896

χ² = 8.788

TABLE VII

SUMMARY OF TESTS OF HOMOGENEITY OF MEANS BETWEEN THE ACCEPTED AND THE IGNORED GROUPS ON READING READINESS IN THE LOWER GRADE GROUP

[Table printed sideways; the numerical entries are not legible in this copy. Columns: Group; Mean; then D.F., Fo, and Hypothesis under each of four headings for Factors Controlled: None, Intelligence and Chronological Age, Intelligence, Chronological Age. Rows: Kindergarten and Grade 1, each with two group lines.]




ed the change in social acceptability, or whether 
these changes occurred simultaneously, remains a matter of speculation.

To summarize the results obtained through the analysis of covariance, it may be said that at the kindergarten level social acceptability does [illegible]

Investigation into the Possible Existence of
"Clusters" of Pupils in the Classrooms


It seemed plausible that in selecting one best friend, a child might [...] much like himself in reading [...], in intelligence, [...] In each division, [...] of every individual in the class [...]ment has been established, [...] that more people would choose th[...], the proportion of these best achievers to be chosen by each of the three groups would be the same. This does not appear to be the case. More of the best achievers were chosen by their own group than by either of the other two, and more of the poorest achievers were chosen by members of their own group than by the best or average achievers. The


(Vol. 22 


117 poorest achievers who made their choices from the ranks of the average achievers indicate that there may be a trend toward choosing someone better, but not too much better, than himself. However, as the chi-square value is only 8.788, which is below the 5% level, the relationship is not statistically significant. Similar procedures were carried out for intelligence and for socio-economic status, but on no measure was there a significant relationship between the type of person choosing and the type of person he chose. Therefore it cannot be concluded that the classroom "clusters" are based on any of the five factors investigated.
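The chi-square comparison described above can be sketched in a few lines. This is a minimal illustration, not the study's own computation: the counts are hypothetical values chosen to follow the pattern of choices described in the text (for example, 117 poorest achievers choosing average achievers), and the function name `chi_square` is an assumption of this sketch. The critical value 9.488 is the standard tabled chi-square value for 4 degrees of freedom at the 5% level.

```python
def chi_square(table):
    """Return (statistic, degrees of freedom) for Pearson's chi-square test
    of independence on a rectangular table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if chooser group and chosen group were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    degrees = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, degrees


# Hypothetical counts (an assumption of this sketch): rows are the chooser's
# group (best, average, poorest achievers); columns are the group of the
# friend chosen (best, average, poorest achievers).
counts = [
    [137, 92, 73],
    [127, 94, 72],
    [104, 117, 80],
]

CRITICAL_5_PERCENT_DF4 = 9.488  # tabled value, 4 d.f., 5% level

statistic, degrees = chi_square(counts)
significant = statistic > CRITICAL_5_PERCENT_DF4
print(f"chi-square = {statistic:.3f} on {degrees} d.f.; significant: {significant}")
```

With a 3-by-3 table the test has 4 degrees of freedom, so a statistic below 9.488 does not reach the 5% level, which is the sense in which the relationship above is called not statistically significant.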


Summary of the Analysis of the Results 


When the problem of the relationship between social acceptability and achievement was set forth, several questions were posed in the hope that the investigation could find an answer to them. The following summarizes the above results according to their contribution toward answering these questions.


First, do the most accepted children and the least accepted children differ in any of the achievements to be measured? The results of the analysis of variance and covariance have indicated that there is a very definite difference between these two groups in each factor measured at all levels except the kindergarten. In addition, the results indicate that if the factor of intelligence is controlled, the differences no longer exist.

Second, if there is a relationship between social acceptability and achievement, is it stronger at the end of the school year than after a summer vacation? In the upper grades a relationship was found to exist between social acceptability and both reading achievement and study skills before the summer vacation and also at the end of the vacation. Between arithmetic ability and acceptability the relationship was not as [...]. At the lower level, a change took place during the summer months or early part of the fall. Before the summer vacation, no relationship was found between acceptability and future reading readiness as tested in the fall; after the summer this relationship was a significant one. The cause of this change cannot be definitely determined by the present investigation.

Third, do children tend to join in clusters on the basis of any of the factors tested? The chi-square analysis presented indicated that no clusters of individuals may be found which are significantly different, in the achievements measured, from clusters which might arrange themselves by chance factors alone. Observation of the data may indicate the possibility of a trend toward choosing friends who are a little, but not too much, better than oneself.

Fourth, will the relationship, if any exists, be stronger in the sixth grade than in kindergarten? Here the results of the analysis of variance and covariance showed that no relationship between acceptability and achievement (as estimated by reading readiness test scores) exists at the kindergarten level, while a significant relationship between social and academic factors exists at the [...] grade level. Therefore, the relationship appears stronger at the end of the elementary years than at the beginning.


Conclusions 


The primary conclusion in this study, the one which rejects the null hypothesis, is that when we consider a classroom of boys and girls in either the early grades or the upper grades, it may be said that in general those who are succeeding in their school work will also be succeeding in their social relationships with their peers.

It would seem possible to draw an inference in regard to the way in which this relationship between social and academic success may come about. A tentative hypothesis that achievement, as a basic factor in acceptability, precedes rather than follows this acceptability is suggested by the change which takes place between the time the child is in kindergarten and the time he is in first grade and is beginning to experience some of the activities of which school success is composed. In kindergarten, before academic success is evident, the future achiever is not chosen in social relationships any more frequently than the future non-achiever. Early in the first grade, when a different kind of achievement than occurred in kindergarten is becoming evident, those who are successful in this achievement are also the socially most accepted. If the acceptability were responsible for this achievement, those who had been popular in the kindergarten the preceding spring should be the ones to succeed in school in the fall. In reading readiness (this measure of estimated future achievement) this was not the case. Since there was a tendency toward increased social acceptability for those who were succeeding in school, it would seem that this success might be a cause.

Further evidence corroborates this hypothesis. This interpretation is based on the fact shown in the covariance analysis that the exclusion or inclusion of intelligence into the situation can make or destroy the relationship between social acceptability and achievement. If we believe in the relative constancy of the I.Q., we know that this is not because the acceptability is affecting the intelligence; therefore it may be that the intelligence and its resulting achievement are affecting the acceptability.




Implications 


However, even if we were sure that the achievement was influencing the acceptability, there is no evidence from the present study to indicate the two factors continue to operate in the same direction throughout all of school. Even if the first step in the process of becoming rejected in school is a matter of non-achievement, it seems possible that a circular reaction might be set in motion so that once a child is unaccepted he becomes insecure and finds it more difficult than ever to succeed.
The analysis which shows that, by holding intelligence constant, these differences may be eliminated is, however, actually based on a hypothetical situation, for there is no possible way in which intelligence can be held constant in an actual classroom of boys and girls. Holding this factor constant in the analysis has given us the opportunity to ascribe to it much of the foundation for the fact that some children are accepted and others rejected, but it does not give us any help in solving the problem which faces so many of our boys and girls of how they may become socially more accepted.

Intelligence may be the foundation of the difference between the accepted and rejected child, but in the eyes of the children this difference is often manifested in the form of school achievement. Therefore, if a child's achievement may be improved, even if his basic intelligence may not, might this not make him more acceptable?


For example, might not one of these achievements needed for acceptability be the ability to contribute to classroom discussions? The child who cannot play baseball is not chosen for the team. Will the child who stumbles in oral reading be chosen for dramatizations? If it is true that the child who achieves is more likely to be accepted, it is not so important for the teacher to work on the acceptability as such, but on the achievement which seems to be back of it in many instances.

It is true that some children will never excel in the type of work they are given in many school situations, but there may be some way in which they can succeed. It is also true that there are many who, for one reason or another, are not working up to their ability and for whom greater success might be developed. And there are still others who may not know how to use the skills and abilities which they do have. This is a serious fact for teachers to consider. If by the way we gear the work we control the acceptability a child will have, provision for individual differences takes on even more significance. If all teachers adjusted the work so that all children were "succeeding" at their own level, would we get this same picture? This presents to the teacher a real challenge.




RELATIONS AMONG FACTORS OF RAW, DEVIATION, AND DOUBLE-CENTERED SCORE MATRICES


CHESTER W. HARRIS 
University of Wisconsin 


THE PURPOSE of this paper is to develop relations among factors of raw scores, deviation scores, and double-centered scores. The treatment is algebraic and is merely an application of the algebra of the specialized matrices in which the interest of the factor analyst centers. Equivalence of matrices, either square or rectangular, and congruency of square matrices are general notions that underlie the discussion. In addition, use is made of a specialized case of equivalence of matrices that occurs if the symmetric matrices generated by these matrices are elements of the same multiplicative group. Finally, a "loose" relation between matrices that generate symmetric matrices of different ranks belonging to the same sub-ring, or sub-algebra, of the total matric ring is employed to describe some of the results. It will be assumed that only real score matrices are employed and consequently that the symmetric matrices they generate are not only real but also Gramian, that is, have no negative principal minors. This is merely an arbitrary choice of fields, made because scores ordinarily are taken to be real numbers.

The plan of the paper is first to describe a notation that permits writing deviation scores as the matric product of raw scores and an idempotent matrix. This notation should have a number of uses, some of which are illustrated in this paper. Next, the resolution of conventional deviation scores into the product of factors and factor scores is presented. This principle is, of course, well known; as it is summarized here, explicit use is made of the notion of groups of rectangular matrices for which a symmetric idempotent matrix is a unit for multiplication. With this method established, it is possible to write equations relating the factors of raw scores to the factors of deviation scores, and equations relating factors of the two types of deviation scores. These results give a precise statement of the reciprocity principle. Finally, double-centered matrices are considered. Apparently two types of double-centered matrices may be identified; the relation of factors of such matrices to factors of raw or deviation scores is outlined.

It should be noted that an important restriction is made in this study. Throughout, it is assumed that the total variation exhibited in a set of data is to be analyzed in terms of common factors. This is a choice that the factor analyst may make. The results given here outline possibilities for relating solutions if he makes such a choice.


A Notation for Deviation-Score Matrices 


A matrix, $X$, of $n$ rows and $N$ columns, with elements $X_{ji}$ ($j = 1, 2, 3, \ldots, n$; $i = 1, 2, 3, \ldots, N$), may be identified conventionally with a matrix of measurements (i.e., raw scores) on $n$ variables for $N$ subjects. Define $x_{ji} = X_{ji} - M_j$ as the elements of the matrix of measurements in conventional deviation-score form; that is, with the mean of each variable equal to zero. An analogous deviation-score matrix has elements, say, $p_{ji} = X_{ji} - M_i$; that is, the mean of each subject is equal to zero. Either of these deviation-score matrices might be written as a difference of two matrices; it is more convenient, however, to write them as matric products. Thus:

\[ XL = \|x_{ji}\| \tag{1} \]

where the $N$-by-$N$ matrix $L$ is written for any $N$ as follows:

\[
L = \begin{bmatrix}
\frac{N-1}{N} & -\frac{1}{N} & \cdots & -\frac{1}{N} \\
-\frac{1}{N} & \frac{N-1}{N} & \cdots & -\frac{1}{N} \\
\vdots & \vdots & \ddots & \vdots \\
-\frac{1}{N} & -\frac{1}{N} & \cdots & \frac{N-1}{N}
\end{bmatrix}
\]



The matrix $L$ has several interesting characteristics. It is completely determined by $N$, and it is evidently symmetric. Each of its arrays sums to zero; consequently, the rows of the product $XL$ necessarily sum to zero, i.e., have means of zero. Further, $L$ may be viewed as the product $KK'$, where $K$ is a set of $(N - 1)$ orthogonal columns giving, with respect [...], if we view the column labels of $X$ as designating a rectangular cartesian system, the effect of $L$ is projective. However, if, as in a scatter-diagram for correlation purposes, we view the row labels of $X$ as designating a rectangular cartesian system, the effect of $L$ is to translate the axes. Finally, $L$ is idempotent, i.e., $L = L^2$, its non-zero roots are all unity, and its rank is $(N - 1)$, given by its trace. A familiar theorem then shows that the rank of $XL$ cannot exceed $(N - 1)$.
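The properties of $L$ listed above can be checked on a small case. The following is a minimal sketch using exact rational arithmetic; the helper names (`centering_matrix`, `matmul`, `transpose`) and the raw scores in `X` are assumptions made for illustration and do not come from the paper.

```python
from fractions import Fraction


def centering_matrix(size):
    """Matrix with (size-1)/size on the diagonal and -1/size elsewhere."""
    return [[Fraction(size - 1, size) if i == j else Fraction(-1, size)
             for j in range(size)] for i in range(size)]


def matmul(a, b):
    """Plain matrix product of two lists-of-rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


# Hypothetical raw scores: n = 2 variables (rows) for N = 4 subjects (columns).
X = [[Fraction(v) for v in row] for row in [[3, 5, 7, 9], [10, 2, 4, 8]]]
L = centering_matrix(4)
XL = matmul(X, L)  # conventional deviation scores: each row has mean zero

is_symmetric = L == transpose(L)
is_idempotent = matmul(L, L) == L
trace = sum(L[i][i] for i in range(4))  # equals the rank, N - 1, for an idempotent
row_sums = [sum(row) for row in XL]

print(is_symmetric, is_idempotent, trace, row_sums)
```

Each row of `XL` is the corresponding row of `X` with its mean (6 in both rows here) subtracted, which is the sense in which post-multiplication by $L$ yields deviation scores.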


An analogous matrix, $M$, may be used to pre-multiply $X$ to secure:

\[ MX = \|p_{ji}\|. \tag{2} \]

[...] $(n - 1)$, and [...] zero.


Analysis of Conventional Deviation Scores 


It is well known that a rectangular score matrix may be expressed as the product of two rectangular matrices, one identified as the factors of the variables and the other as the factor scores of the subjects.² Let us assume the main theorems of Guttman's paper and summarize the resolution of a matrix of deviation scores into principal-axis factors and factor scores. Note that since $L$ is idempotent, $XLX'$ equals $(XL)(XL)'$. Let $X$ and consequently $XL$ be real; then the symmetric matrices $XLX'$ and $LX'XL$ are real and Gramian, i.e., all their non-zero latent roots are positive. If $XL$ is of rank $r$, then both $XLX'$ and $LX'XL$ are of rank $r$ and have exactly $r$ non-zero latent roots. Flanders has shown that the elementary divisors of $AB$ and $BA$ corresponding to non-zero roots are the same.³ Therefore, $XLX'$ and $LX'XL$, which are of the same rank, have the same non-zero roots. In fact, if $(y)$ is a characteristic vector of $XLX'$ corresponding to a non-zero root, $k$, then $(LX'y)$ is a characteristic vector of $LX'XL$ corresponding to the same non-zero root, $k$.
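The statement that $XLX'$ and $LX'XL$ share their non-zero roots can be illustrated without an eigen-solver. The sketch below is a consistency check rather than a proof: it compares the power sums $\operatorname{tr}(A^k)$ of the two matrices, which are the sums of the $k$-th powers of their roots, and then checks the vector mapping $y \mapsto LX'y$ on one case. The raw scores are hypothetical and were chosen so that $XLX'$ is diagonal and a characteristic vector can be read off directly; all helper names are assumptions of this sketch.

```python
from fractions import Fraction


def centering_matrix(size):
    """Matrix with (size-1)/size on the diagonal and -1/size elsewhere."""
    return [[Fraction(size - 1, size) if i == j else Fraction(-1, size)
             for j in range(size)] for i in range(size)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def trace_of_power(m, k):
    """Trace of m raised to the k-th power (sum of k-th powers of the roots)."""
    power = m
    for _ in range(k - 1):
        power = matmul(power, m)
    return sum(power[i][i] for i in range(len(m)))


# Hypothetical raw scores, n = 2 variables by N = 4 subjects.
X = [[Fraction(v) for v in row] for row in [[3, 5, 7, 9], [5, 3, 3, 5]]]
L = centering_matrix(4)
XL = matmul(X, L)

A = matmul(XL, transpose(XL))  # XLX', of order n
B = matmul(transpose(XL), XL)  # LX'XL, of order N

power_sums_A = [trace_of_power(A, k) for k in range(1, 5)]
power_sums_B = [trace_of_power(B, k) for k in range(1, 5)]

# For this X, A is diagonal, so y = (1, 0)' is a characteristic vector
# of A with root 20.
y = [[Fraction(1)], [Fraction(0)]]
root = A[0][0]
v = matmul(transpose(XL), y)  # LX'y
Bv = matmul(B, v)             # should equal root * v

print(power_sums_A == power_sums_B, Bv == [[root * e for e in row] for row in v])
```

Equal power sums are what one expects when the two matrices differ only in the number of zero roots they carry.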

Let $G$ designate the matrix of $r$ unit-length characteristic vectors of $XLX'$ corresponding to its $r$ non-zero latent roots. Let $r \le n < N$. Then

\[ XLX'G = GD^2, \tag{3} \]

where $D^2$ is written purely for notational convenience. $D^2$ is a diagonal matrix of order $r$ and is non-singular; its non-zero elements are the (positive) latent roots of $XLX'$. We shall use $D$ to designate the diagonal matrix of positive square roots of the non-zero elements of $D^2$. Now, since the columns of $G$ are unit-length orthogonal columns,

\[ G'G = I_r, \tag{4} \]

and

\[ GD(DG'GD)^{-1}DG' = GG' \tag{5} \]

is a symmetric matrix that is a right and left unit for $XLX'$. $GG'$ also is a pre-multiplication unit for any factor matrix that reproduces $XLX'$. A theorem guarantees the existence of an $N$-by-$N$ orthogonal matrix, $V$, such that:

\[ XLV = \|F \;\; 0\|, \tag{6} \]

where $F$ is a factor matrix that reproduces $XLX'$. Then $GG'$, which is a pre-multiplication unit for $F$, necessarily is a pre-multiplication unit for $XL$, since

\[ GG'XLV = XLV. \tag{7} \]

Considering the matrix $LX'XL$ and equation (3), we may write:

\[ LX'XL(LX'G) = LX'GD^2. \tag{8} \]

Note that the $r$ characteristic vectors $(LX'G)$ are not of unit length, since from (3) and (4),

\[ G'XLX'G = D^2. \tag{9} \]

Normalize these characteristic vectors to secure:

\[ LX'GD^{-1} = P, \tag{10} \]

that is, $P$ is a matrix of normalized factor scores. It is evident that $LP = P$, which implies that each column of $P$ sums to zero and that $PP'$ necessarily is singular. Then, from (3) and (10):

\[ XLP = XLX'GD^{-1} = GD, \tag{11} \]

or, since $LP = P$,

\[ XP = GD. \tag{12} \]

The singular matrix $PP'$ is a symmetric idempotent, a right and left unit for $LX'XL$ and a pre-multiplication unit for $LX'$, by a demonstration similar to that for $GG'$. Also, $P'P = I_r$. Therefore, (11) may be written:

\[ XL = GDP'. \tag{13} \]

This is the resolution of $XL$ into the product of principal-axis factors, $GD$, and normalized factor scores, $P'$.

There are three ways in which the resolution described in (13) might be accomplished. One is to operate, by an iteration procedure, on the matrix $XL$ itself. This would be factoring the deviation scores directly.⁵ A second is to form the matrix $XLX'$ and solve for its characteristic vectors, $G$, and the accompanying non-zero roots. Then equation (10) yields $P$. A third approach is to form the matrix $LX'XL$ and solve for its characteristic vectors, $P$. Then either (11) or (12) yields the factors, $GD$. Consider procedures two and three. Both of them are equally "correct" in that they arrive at the same results, even though they employ two different symmetric matrices for analysis. Clearly a "reciprocity" holds for the analysis of $XLX'$ and of $LX'XL$, even though the one matrix, $XLX'$, would commonly be judged to be "meaningful", and the other not. Note that $XLX'$ is exactly a matrix of variances and covariances with each entry multiplied by $N$, and that it can be reduced by pre- and post-multiplication by a diagonal matrix to the familiar correlation matrix with units in the diagonal. Burt's reciprocity principle, therefore, is true for any matrix of data, say, $Y$, in that the principal-axis transformation of either $YY'$ or $Y'Y$ leads to the same resolution of $Y$. Burt, of course, specified in his proof the case in which $YY'$ and $Y'Y$ both are meaningful.⁶ Later we shall write equations that describe the cases Burt regarded as similar in principle but as exhibiting only an approximate reciprocity.

This resolution assumes that the total variation is accounted for in terms of common factors. We might have chosen $GDW$ as factors, where $W$ is a non-singular transformation of order $r$, rather than $GD$, the principal-axis factors of $XLX'$. Note that substituting $GDW$ for $GD$ in (5) yields the same unit, $GG'$, as before and that consequently the effect of selecting $GDW$ rather than $GD$ is merely to induce the inverse of this transformation on the right of $P$. These transformed matrices reproduce $XL$ exactly:

\[ XL = (GDW)(W^{-1}P'). \tag{14} \]

However,

\[ (GDW)(W'DG') \ne XLX' \tag{15} \]

unless $W$ is orthogonal. Instead:

\[ (GDW)(W'W)^{-1}(W'DG') = XLX' \tag{16} \]

is generally true. When $W$ is not orthogonal, $GDW$ should be identified with an oblique structure, rather than a pattern, in Holzinger's terms, providing that we recognize that the elements of $GDW$ are the correlation coefficients associated with Holzinger's concept of structure only when the columns of $W$ constitute direction cosines. Clearly, then, the resolution of a score matrix into the product of factors and factor scores is stated generally as a resolution into the product of a structure matrix and factor scores. Though it is convenient for the purpose of the proofs to specify a principal-axis resolution, the relationships developed below are generally true for any method, providing all the variation in the data is analyzed in terms of common factors.

1. An exercise in Ferrar is pertinent. See: W. L. Ferrar, Algebra (Oxford, England: Oxford University Press, 1941), p. 108.

2. See, for example: Karl J. Holzinger, "Factoring Test Scores and Implications for the Method of Averages," Psychometrika, IX (September 1944), 155-167. Louis Guttman, "General Theory and Methods for Matric Factoring," Psychometrika, IX (March 1944), 1-16. Cyril Burt, "Correlations Between Persons," British Journal of Psychology, XXVIII (July 1937), 59-96.

3. Harley Flanders, "Elementary Divisors of AB and BA," Proceedings of the American Mathematical Society, II (December 1951), 871-873.

4. That GG' is idempotent and a unit as described is shown in Chester W. Harris, "The Symmetrical Idempotent Matrix in Factor Analysis," Journal of Experimental Education, XIX (March 1951), 239-246.

5. See Guttman, op. cit., pp. 14-15, for such a suggestion. See also Holzinger, op. cit., for centroid and group factoring methods applied directly to a score matrix.

6. Burt, "Correlations Between Persons," loc. cit.
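The two claims about the transformed factors $GDW$ made above can be verified in a few lines. The following worked derivation is a sketch that assumes only $G'G = I_r$, the non-singularity of $W$ and $D$, and the identity $XLX' = GD^2G'$ (which follows from (3) together with $GG'$ being a unit for $XLX'$).

```latex
\begin{align*}
% Substituting GDW for GD in (5) returns the same unit GG':
GDW\,(W'DG'GDW)^{-1}\,W'DG'
  &= GDW\,(W'D^{2}W)^{-1}\,W'DG' \\
  &= GDW\,W^{-1}D^{-2}(W')^{-1}\,W'DG' \\
  &= GG'. \\[6pt]
% Equation (16): inserting (W'W)^{-1} restores XLX' for any non-singular W:
(GDW)(W'W)^{-1}(W'DG')
  &= GD\,W\,W^{-1}(W')^{-1}\,W'\,DG' \\
  &= GD^{2}G' \\
  &= XLX'.
\end{align*}
```

When $W$ is orthogonal, $W'W = I_r$ and the middle factor drops out, which is why (15) becomes an equality only in that case.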


Relation Between Factors of Raw and Deviation Scores

The relation between factors of raw scores and factors of conventional deviation scores is an interesting one. Resolve the matrix $X$ into a product of principal-axis factors and factor scores, thus:

\[ X = G_1D_1P_1'. \tag{17} \]

It can be shown that $G_1G_1'$ and $P_1P_1'$ are units of different orders, and that $G_1'G_1 = P_1'P_1 = I_r$. Substituting (17) in (12) gives:

\[ GD = G_1D_1(P_1'P). \tag{18} \]

With $(P_1'P)$ a non-singular matrix of order $r$, (18) identifies $GD$, the principal-axis factors of $XLX'$, as an (oblique) structure for $XX'$. That it is oblique (but not necessarily a unit-length structure) is readily shown. With $X$ and $XL$ of the same rank, (18) implies that $XLX'$ and $XX'$ belong to the same multiplicative group of matrices with unit $G_1G_1' = GG'$; this holds for $r$ equal to or less than $n$. Then:

\[ G(G'G_1) = G_1, \tag{19} \]

and

\[ G_1(G_1'G) = G, \tag{20} \]

with $(G_1'G)$ necessarily an orthogonal matrix. Pre-multiplying (18) by $G_1'$ and re-arranging gives:

\[ (P_1'P) = D_1^{-1}(G_1'G)D, \tag{21} \]

which cannot be an orthogonal matrix unless $D_1^{-1}D = I_r$, which is the trivial case of $XX' = XLX'$.

From this it can be inferred that with $XX'$ and $XLX'$ distinct but of the same rank, $P_1P_1' \ne PP'$, and there is no transformation operating solely on the right of $P_1$ that yields $P$. To see this, recall that $PD$ constitutes principal-axis factors of $LX'XL$. Post-multiply (17) by $L$ and equate it to (13), thus:

\[ GDP' = G_1D_1P_1'L. \tag{22} \]

Then

\[ PD = LP_1D_1(G_1'G), \tag{23} \]

that is, $PD$ is an orthogonal rotation of $LP_1D_1$. It is then evident that:

\[ P = LP_1D_1(G_1'G)D^{-1}. \tag{24} \]

Briefly, then, we have proved that with $X$ and $XL$ of the same rank the principal-axis factors of $XX'$ may be transformed by post-multiplication by a non-singular matrix into the principal-axis factors of $XLX'$, and that the principal-axis factors of $LX'XL$ are an orthogonal rotation of the matrix $LP_1D_1$, where $P_1D_1$ designates the principal-axis factors of $X'X$. The first statement is obviously true for any type of factors providing they are non-singular transformations of principal-axis factors. The second is true for any factors that are orthogonal rotations of the principal-axis factors; it must be modified if non-orthogonal factors are meant.
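The step from (22) to (23) and (24) is brief enough to write out in full. The derivation below is a sketch that assumes the symbols as defined above, with $G'G = I_r$, $L$ symmetric, and, for the orthogonality remark, $X$ and $XL$ of the same rank so that $GG' = G_1G_1'$.

```latex
\begin{align*}
GDP' &= G_1D_1P_1'L
  && \text{(22)} \\
PDG' &= LP_1D_1G_1'
  && \text{transpose both sides; } L' = L \\
PD &= LP_1D_1(G_1'G)
  && \text{post-multiply by } G,\ \text{using } G'G = I_r \quad \text{(23)} \\
P &= LP_1D_1(G_1'G)D^{-1}
  && \text{post-multiply by } D^{-1} \quad \text{(24)} \\[6pt]
(G_1'G)(G_1'G)' &= G_1'(GG')G_1 = G_1'(G_1G_1')G_1 = I_r
  && \text{so } (G_1'G) \text{ is orthogonal.}
\end{align*}
```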

It may occur that the rank of $XL$ is less than the rank of $X$. If so, we have a "loose" relation between the factor matrices $G_1D_1$ and $GD$. It will be true that $G_1G_1'$ is a unit for $XX'$, and $GG'$, which is a unit for $XLX'$, is not a unit for $XX'$. The matrices $XX'$ and $XLX'$ are not members of the same multiplicative group, since they are no longer of the same rank, but they are both members of the sub-ring of matrices of order $n$ that is defined by pre- and post-multiplication of every element of the total matric ring by $G_1G_1'$. The results, of course, are trivial if $G_1G_1'$ is non-singular since this implies the total matric ring is taken as a sub-ring of itself. However, the relations given in equations (18), (23), and (24) hold for $XL$ of lower rank than $X$, with the matrices in parentheses singular.


Relation Between the Factors of X and MX

Recall that $MX$ is an alternate deviation-score matrix, that is, one in which the means of the subjects have been made zero. Write:

\[ MX = G_2D_2P_2'. \tag{25} \]

Then, using (17), we may equate:

\[ MG_1D_1P_1' = G_2D_2P_2'. \tag{26} \]

Two consequences of this are:

\[ MG_1D_1 = G_2D_2(P_2'P_1), \tag{27} \]

and, since $MG_2 = G_2$,

\[ P_1D_1(G_1'G_2) = P_2D_2. \tag{28} \]

When $X$ and $MX$ are of the same rank, as they may be when $N$ is less than $n$, $P_1P_1'$ equals $P_2P_2'$; $G_1G_1'$ is not equal to $G_2G_2'$; the matrix $(P_2'P_1)$ is orthogonal; and $(G_1'G_2)$ is non-singular. These relations are the counterparts of those for $X$ and $XL$.


Relations Between Factors of MX and XL

The case of $MX$ and $XL$ is the case of the same set of scores put into the two possible deviation-score forms. Then $X'MX$ gives interrelationships of the subjects expressed as a scalar, $n$, times the matrix of variances and covariances of persons, and $XLX'$ gives interrelationships of the variables expressed as a scalar, $N$, times the matrix of variances and covariances of the variables. Equations (13) and (25) give the resolution of $XL$ and $MX$ into principal-axis factors and factor scores. Then we may equate:

\[ MGDP' = G_2D_2P_2'L, \tag{29} \]

which, recalling that $LP$ equals $P$ and $MG_2$ equals $G_2$, gives:

\[ MGD = G_2D_2(P_2'P) \tag{30} \]

and

\[ PD(G'G_2) = LP_2D_2. \tag{31} \]

The matrices in parentheses in (30) and (31) are related in this fashion:

\[ D_2^{-1}(G_2'G)D = (P_2'P). \tag{32} \]

They may, of course, be singular; the equations hold, however, regardless of the rank of these matrices.

Equations (30) and (31) are important since they express the relation to be found between the results of independent analyses of the "meaningful" matrices $X'MX$ and $XLX'$. The relation is one that Burt judged to be only "approximate" unless the data were double-centered.⁷ Actually the relation is precise; the complications are introduced by the matrices $L$ and $M$, which [...]. However, the presence of $L$ and $M$ in the equations makes clear that the relation is one between factors, or between factor scores, that have been made bipolar, since this is exactly the effect of $M$ on $GD$ and $L$ on $P_2D_2$. It also is evident that given only $G_2D_2$, one could not reproduce $GD$ and similarly given only $PD$, one could not reproduce $P_2D_2$. However, given, say, $GD$ and $G_2D_2$ (or any non-singular transformations of them) it will always be possible to solve for a transformation that relates $MGD$ and $G_2D_2$. This is true because $M$ is known when $n$ is known, and the two matrices $MGD$ and $G_2D_2$ generate symmetric matrices belonging either to the same multiplicative group (when they are of the same rank) or to the same sub-ring (when they are of different ranks). The rationale and equations for the solution of such problems have been presented elsewhere.⁸ An analogous statement may be made concerning the problem of relating a given $PD$ and $P_2D_2$.

It also is possible to write the relation given in (30) and (31) in terms of factors and factor scores secured from analyzing correlations. Write:

\[ S_aXL = G_aD_aP_a' \tag{33} \]

and

\[ MXS_b = G_bD_bP_b', \tag{34} \]

where each $S$ is a diagonal matrix that makes the variances of the appropriate arrays unity. Then $S_aXLX'S_a$ is the complete correlation matrix describing interrelationships of the variables, and $S_bX'MXS_b$ is the complete correlation matrix describing interrelationships of the subjects. Each $S$ is non-singular, and so we may write:

\[ XL = S_a^{-1}G_aD_aP_a' \tag{35} \]

and

\[ MX = G_bD_bP_b'S_b^{-1}, \tag{36} \]

and then equate:

\[ MS_a^{-1}G_aD_aP_a' = G_bD_bP_b'S_b^{-1}L, \tag{37} \]

to give:

\[ MS_a^{-1}G_aD_a = G_bD_b(P_b'S_b^{-1}P_a) \tag{38} \]

and

\[ P_aD_a(G_a'S_a^{-1}G_b) = LS_b^{-1}P_bD_b, \tag{39} \]

which are analogous to (30) and (31).
Since the Burt-Stephenson controversy is fam- 
iliar to many readers, it may be pertinent to com- 


he e $ 
; Quations makes clear that ; 
* Orig millan 
Co», 191). See especially bis dins 


Cusg, "UT, Factors of the Mind (New York: The 
ters XVI, WII, and 


on in q cr x 


? SE. cit. 


XVIII of Part III. 


1. 22 
58 JOURNAL OF EXPERIMENTAL EDUCATION (Vo 


ment on these results in that context. 9 Proyig- 


method by which we operate on the rowa of = bos 
i i i t non-zero, sums for de 
i i illing to analyze in terms of common achieve equal, bu m hieves 
ος ail the aration in a given set of data, it Dividing each row by the sum of ire dine 
ps de makes no difference whether he pro- this. Let Sr be the diagonal matrix Shot is doubler 
ences by multiplying together rows to secure a symmetric matrix for analysis or by multiplying together columns to secure such a matrix. This is not thought of as a "meaningful" one. It seems [...] all the variation in terms of common factors; if so, he probably would judge this generalization to be irrelevant. However, the generalization does raise a question about the two types of data that Stephenson postulates, one appropriate for r-technique and the other appropriate for Q-technique. [...] makes such a distinction [...] "meaningful." In fact, it was shown that the raw scores themselves, which generate two "meaningless" matrices, might be analyzed and the results related to the analysis of deviation scores. This could easily be extended to show the relationship of the results of the analysis of raw scores to the results of the analysis of correlation matrices. Burt's anticipation that a reciprocity would hold generally, as well as for double-centered matrices, has been verified and the equations written as (30) and (31), (38) and (39). The results given here clearly support Burt's position.

Double-Centered Matrices

The notation used here suggests immediately that for any X, MXL is double-centered, i.e., it sums to zero both by rows and by columns. Since the multiplication of matrices is associative, it makes no difference whether one first forms (MX) and then centers by rows, or first forms (XL) and then centers by columns. It also is evident that a resolution of MXL might be written in terms of the resolution of X, the resolution of XL, the resolution of MX, or be written de novo for the double-centered matrix. Thus the relations of factors (or factor scores) of X, XL, or MX to those of MXL may readily be written if they are desired.

Two types of double-centered matrices are worth noting. Consider a set of data, X, and a [...] this division. Then it follows that MSrX is double-centered, that is,

MSrX = MSrXL. (40)

For example, Howells reports the use of a matrix of this type.¹⁰ He found the mean of each measurement and then took each score as a percentage of the mean for that measurement. Thus the mean of each row, in this situation, became 100. Then he correlated subjects and used a communality principle to factor these correlations. Thus he used a matrix of the type of (40), post-multiplying it by a diagonal to give unit variances for the subjects. Howells raises the question of the relationship of his results to those of a conventional analysis of measurements, and he points out that an attempt to establish a relationship did not yield satisfactory results.¹¹ The development given here outlines one way in which a relationship could be established.

A second type of double-centered matrix arises when we post-multiply X by a diagonal, Sc, whose effect is to make the sum of the columns a non-zero constant. Then:

XScL = MXScL. (41)

A paired-comparison instrument yields for each subject a set of scores that sum to a constant; a set of such data for N subjects would then constitute a matrix (XSc), in which X actually is unknown. Putting this matrix into conventional deviation-score form, that is, post-multiplying by L, necessarily double-centers the data. Guilford's prohibition against factoring ipsative scores and his recommendation that Q-technique be used instead¹² apparently is a preference for analyzing the matrix MXSc, which is not double-centered, rather than the matrix XScL, which is. The methods of relating solutions given here suggest that equations may be written showing that the factors derived from analyzing the two distinct matrices actually are related in a manner analogous to the relation of factors of raw and conventional deviation scores.
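
The double-centering property described in this section can be checked numerically. The sketch below is only an illustration under stated assumptions: the definitions of M and L are not given in this portion of the article, so both are assumed here to be centering operators of the form I - (1/n)J (J a matrix of ones), with M pre-multiplying (making column sums zero) and L post-multiplying (making row sums zero). The data matrices X and Y are hypothetical.

```python
# Assumption: M and L are centering matrices I - (1/n) * J, J being all ones.
# The matrices X and Y below are hypothetical illustration data.

def centering_matrix(n):
    """Return the n x n matrix I - (1/n) * J."""
    return [[(1.0 if i == j else 0.0) - 1.0 / n for j in range(n)]
            for i in range(n)]

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    inner = len(b)
    return [[sum(a[i][k] * b[k][j] for k in range(inner))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def row_sums(m):
    return [sum(row) for row in m]

def col_sums(m):
    return [sum(col) for col in zip(*m)]

def close(a, b, tol=1e-9):
    """True when two matrices agree element by element within tol."""
    return all(abs(x - y) <= tol
               for ra, rb in zip(a, b) for x, y in zip(ra, rb))

X = [[3.0, 1.0, 4.0, 1.0],
     [5.0, 9.0, 2.0, 6.0],
     [5.0, 3.0, 5.0, 8.0]]
M = centering_matrix(len(X))      # acts on the left
L = centering_matrix(len(X[0]))   # acts on the right

# Associativity: (MX)L and M(XL) are the same double-centered matrix.
mx_then_l = matmul(matmul(M, X), L)
m_then_xl = matmul(M, matmul(X, L))

# Second type: every column of Y sums to the same non-zero constant (10),
# so post-multiplying by L alone already double-centers it, as in (41).
Y = [[1.0, 2.0, 3.0, 4.0],
     [4.0, 5.0, 6.0, 1.0],
     [5.0, 3.0, 1.0, 5.0]]
YL = matmul(Y, L)
M_YL = matmul(M, YL)
```

Under these assumptions, `row_sums(mx_then_l)` and `col_sums(mx_then_l)` are zero up to rounding, and `YL` is unchanged by a further pre-multiplication by `M`.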
, Burt and Willian Stephe; " 3 


10. W. W. Howells, "A Factorial Study of Constitutional Type," American Journal of Physical Anthropology, X (March 1952), 91-135.
11. Ibid., p. 114.
12. J. P. Guilford, "When Not to Factor Analyze," Psychological Bulletin, XLIX (January 1952), p. 31.


FACTORS RELATED TO THE EXTENT OF
MORTALITY AMONG HOME ECONOMICS
STUDENTS IN CERTAIN COLLEGES OF
MINNESOTA, WISCONSIN AND IOWA
DURING 1943-50


HELEN Y. NELSON 
Augsburg College 
Minneapolis, Minnesota 


COLLEGE FACULTIES need to consider the implications of the fact that a very large percentage of students who enter as freshmen drop out before they graduate. The writer recently completed an investigation of the academic mortality of home economics students in 14 colleges and universities in Minnesota, Wisconsin, and Iowa. These represented all of the colleges in the three states which offered a degree in home economics and which had records of their drop-out students. Involved in the study were those who entered as freshmen from the fall of 1943 through the spring of 1948, and who dropped out of home economics before the spring term of 1950.

The writer visited personally each of the institutions and obtained data from official records regarding many items: those relating to the college, those relating to the student body, and those relating to the drop-outs from the home economics department. She learned about the type of support, the length of term, the cost of tuition and fees, and the honor-point equivalents for the different letter marks. She discovered the size of the student body and the home economics enrollment in terms of majors and non-majors for each year, and the number who graduated. She discovered the age at entrance of the drop-outs, how many credits they had earned and the marks they received, and how long they had remained in college. Each college supplied the names of all drop-outs for the period studied and their addresses.


A questionnaire was sent to each of the 2263 drop-outs,* but 197 letters were returned because they could not be delivered. Sixty-eight percent of the group who presumably received the questionnaire responded. Their replies furnished information regarding their reasons for enrolling in home economics; guidance before and during college; their extra-curricular activities, church attendance and work experiences; home community and living conditions at college; their reactions to college courses in home economics and in other required subjects; specific reasons for leaving home economics; and present marital status.

* The study involved only those who had left in good academic standing. Eleven percent had been eliminated because of scholastic deficiencies.

The responding group was apparently an unselected sample of the total population from whom information was solicited, because no differences in means or variances for four factors (age at entrance, credits earned, honor-point ratios, and length of stay in college) were large enough to be of any practical significance.

The findings were punched on Hollerith cards and the cases were classified into various subgroups: those from large institutions and from smaller colleges; those who had transferred to another curriculum or another college and those who had dropped out of college; and those who had remained in home economics for different lengths of time.

Analysis of the data revealed the facts discussed in the following section. Some of the elimination took place within a few days or weeks after entrance, but the greatest losses occurred at the end of the first and second year. Only 13 percent dropped out after the beginning of the junior year. The scholarship average for the voluntary drop-outs was 2.14 (with A equal to 4 honor points per credit). Most girls had entered at 18; the median length of stay was five terms, during which time they had earned an average of 58 credits. College records supplied little information as to reasons for leaving; for 90 percent the reasons were unknown.


Extent of Student Mortality in Home Economics

The extent of student mortality in home economics was determined by calculating the ratio of students withdrawing to the total group of freshman entrants to that curriculum. About half of the freshmen entering home economics in the colleges studied during the years 1943-45 dropped out before graduation. Mortality was greater in large institutions than in small colleges (32 percent). The percentage of students who dropped out of home economics in these colleges ranged from 19 to 55 percent.

60 JOURNAL OF EXPERIMENTAL EDUCATION (Vol. 22)

[...] remained but a short time and those who remained longer.


Participation in Activities Outside the Classroom

Most of the respondents from both large and small colleges had participated in some activities; the longer they were in college the more likely they were to report such participation. Informal social contacts were most frequently listed as having given them satisfaction; next came activities in church groups, athletics, and music. The modal number of hours spent in extra-curricular activities was less than [...] a week, but about a tenth of the students reported that they usually devoted twenty hours a week or more to such activities.

Respondents (from both large and small colleges) who remained in college less than a year were likely to have participated in activities less in college than in high school. Those who stayed longer than a year were significantly more likely to have participated as much as in high school or to have increased the amount of their participation in college.
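
The questionnaire returns and the mortality ratio reported in this article lend themselves to a short worked check. The sketch below is illustrative only: the figures 2263, 197, and 68 percent come from the text, the estimated number of respondents is a derived approximation rather than a reported figure, and the counts passed to `mortality_ratio` are hypothetical.

```python
def presumed_recipients(sent: int, undelivered: int) -> int:
    """Questionnaires presumably received: those sent minus those returned."""
    return sent - undelivered

def estimated_respondents(recipients: int, response_rate: float) -> int:
    """Approximate respondent count implied by a reported response rate."""
    return round(recipients * response_rate)

def mortality_ratio(withdrawals: int, entrants: int) -> float:
    """Ratio of students withdrawing to the total group of freshman entrants."""
    if entrants <= 0:
        raise ValueError("entrants must be a positive count")
    return withdrawals / entrants

# Figures taken from the text.
recipients = presumed_recipients(sent=2263, undelivered=197)
# Derived estimate only; the article reports the rate, not this count.
respondents = estimated_respondents(recipients, 0.68)

# Hypothetical counts, chosen only to show how a 32 percent ratio arises.
example_ratio = mortality_ratio(withdrawals=32, entrants=100)
```

The estimate is approximate because the reported 68 percent is itself a rounded figure.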


colleges said they transferred to one where the 
Costs were lower, 


Home Communities

Over a third of the respondents from large colleges and nearly half of those from small colleges came from towns of less than 5000 population. Approximately a third of them came from towns with populations of 5000 to 50,000. [...] of the respondents who left large colleges came from cities of over 50,000.


Living Arrangements During College

More than two-thirds of the respondents lived in dormitories while attending college; less than a fourth lived with parents or relatives; about a tenth in rooming houses. Those who lived


in home economics; a sixth gave 


8, and the others said 
that they had had no rea] interest in the field but 


into the field of their 
first choice, 

Unless they had Studied home 
the senior high School, few liste 
struction as a factor causing the 
that course in college. 


Percent of those who 

About three-fourt 
had attended a Colle 
munity but this pro 
Colleges, The dis 


college, they apparently received more help from counselors than they had in high school; but fewer reported help from any source in college than in high school. The picture was similar in the case of those who


Religious Affiliation

Most of the respondents had maintained the same religious affiliation as they had at home. The relatively few who had changed after coming to college were usually students who had remained a year or more in college. Most students who stayed in college reported that they had been church attenders.

September 1953)


Reactions to College Courses

More than half of the respondents from large colleges cited the prerequisite requirements (especially those in science) as unsatisfactory; significantly fewer from the small colleges expressed this reaction. More respondents criticized the quantity than the difficulty of the courses.

About [...] of the respondents from large colleges felt that their home economics courses had been taught in too theoretical a manner. Significantly fewer from the small colleges expressed this reaction.

Significantly more of the drop-outs from the small colleges (56 percent) cited an undesirable amount of repetition between high school and college; only about a third of the respondents from the large colleges who had had instruction in high school made this criticism.

[...] of the respondents who had attended small colleges believed that the practical courses had come too late in their curriculum; and nearly two-thirds of those from the large institutions expressed this reaction. The replies of both married and single respondents were similar on this point.


Marital Status

Three-fourths of the respondents were married at the time of this inquiry. A few had married before starting college; some married while attending school, and one in ten married immediately after leaving school. The majority of those who married after leaving college had done so within a year.


Employment During and After College

Less than half of the respondents from either large or small colleges had earned any of their expenses through work done during the academic year, although a few had earned more than a third of their expenses. More than half reported earning [...].

[...] of the respondents who were married had worked for some time at a job outside the home after marriage, although 80 percent of them reported no occupation other than homemaking at the time they received the questionnaire.

About a fourth of them had held a job for a few months; [...] worked from [...] two years. Some of the women had worked [...] years or [...]. Very few had held jobs as [...]. Only a few had held jobs which utilized their home economics training, whether they had stayed in home economics [...] or more years or only a short while.


Reactions Toward Their College Experience 


While more of these young women who had spent a year or more in college expressed favorable reactions toward college, those who were critical were much more vocal, going into a great deal of detail to explain what they did not like. However, even though many respondents offered criticisms of their college home economics courses, a large number of them stated they were sorry they had not stayed longer; others remarked that they wished they could return for more study in home economics.


Implications of the Investigation 


A certain amount of elimination must be expected from any entering class, but a loss of half of the students entering home economics as freshmen should certainly make faculties pause to consider its implications. The fact that the great majority of these students who left home economics before graduation were apparently competent enough to have finished their courses disproves the belief held by many people that most of the drop-outs are not of college caliber.

Since marriage [...] cited by respondents [...], it would be well to [...] what is needed for a well-rounded preparation for home and family living in present-day society, [...] homemaking programs in [...] of these needs. A [...] of courses, the sequence of [...] types of experiences [...] important outcomes.

Since the greatest losses occurred at the end of the first and second years, it would be well for home economics departments to examine the first two years of their programs with particular care to determine the appropriateness of the work for students who will not complete a four-year curriculum.

Since a third of the respondents who entered college expecting to prepare for a career in some area of home economics did not carry out their plans, the basic program for all majors should be appraised in the light of education for home and family living. It may be that a series of courses set up to prepare students for a profession (even though it is a home economics profession) does not furnish optimum preparation for home and family living.

Certain factors in home economics curricula appeared to be related to the students' loss of interest and to withdrawal before graduation. One




and duplication of effort may be avoided. Time thus saved by the able student exempted on the basis of tests from elementary courses in home [...] seemed to be related to loss of interest and withdrawal [...] indicated.

Lack of effective guidance and counseling before and during college attendance seemed to be associated with withdrawal. [...] distinctive opportunities for helping her students through counseling.




None of the following environmental and personal factors seemed to be associated with withdrawal from home economics: the place of residence while at college, the distance of the student's home from college, church affiliation, participation in extra-curricular activities, part-time employment while attending college.

The size of the institution seemed to have no relationship to many of the factors studied. With respect to some, however, differences appeared, suggesting strengths and weaknesses of both types of programs. Student mortality in home economics was greater in large institutions. More respondents who had left large colleges criticized the number and difficulty of prerequisite requirements; more of them felt that the practical home economics courses were not offered soon enough, and more of them felt that their home economics courses had been too theoretical. However, more of the respondents from the small than from the large college considered that their elementary college courses had been a repetition of materials they had studied in high school.

This study has revealed many phases of the problem of the withdrawal of home economics students which merit further research. First, consideration should be given to a study of what is needed as a well-rounded preparation for home and family living. Then such experimental studies as the following might well be carried out: using pretests as a basis for locating able students who could be exempted from elementary courses in home economics; teaching beginning home economics courses without prerequisites; offering shorter chemistry courses for majors other than in dietetics and food technology.

It is, of course, not assumed that the findings of this investigation of a group of drop-out students from colleges in three states would apply to all colleges that offer home economics courses [...] to a degree. But it is believed that similar studies by home economics [...] appraising their [...] institutions and in improving them.


ERRATA 


The last line of column 2 on page 269 of the Caffrey-Wheeler article published in the March 1953 Journal of Experimental Education should read

either (A + B) or (C + D) ≠ N/2.

That is, the formula (3) cannot be used if the sums indicated are not equal to N/2.
The so-called Yates correction for use when the entry in any cell is equal to or less than 5 (or 10, as some texts specify) can be embodied in our formula (3). This could be concisely stated as follows:

"When any cell entry is equal to or less than a specified minimum (say, 5), a corrected value may be computed by decreasing the absolute difference |B - D| by 1, as follows:

\[
\chi^2 = \frac{N\,(|B - D| - 1)^2}{(B + D)(N - B - D)}
\]
"
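
The corrected formula quoted in this erratum can be compared against the ordinary 2 x 2 chi-square. The following is a minimal sketch under stated assumptions, not the authors' procedure: it assumes the four cells are laid out as rows (A, B) and (C, D), that each row sums to N/2 (the condition the erratum places on formula (3)), and that the uncorrected formula (3) has the same form without the subtraction of 1. The function names and the example cell counts are hypothetical.

```python
def chi_square_2x2(a, b, c, d, yates=False):
    """Standard chi-square for a 2 x 2 table, optionally Yates-corrected."""
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        # Usual continuity correction: reduce |AD - BC| by N/2, not below 0.
        diff = max(diff - n / 2, 0.0)
    return n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

def simplified_chi_square(b, d, n, yates=False):
    """Simplified form using only B, D and N.

    Assumes both rows of the table sum to N/2; with the correction,
    |B - D| is decreased by 1 as the erratum describes.
    """
    diff = abs(b - d)
    if yates:
        diff = max(diff - 1, 0)
    return n * diff ** 2 / ((b + d) * (n - b - d))

# Hypothetical item-analysis counts: each row sums to 50, so N = 100.
A, B, C, D = 30, 20, 18, 32
N = A + B + C + D

plain_simple = simplified_chi_square(B, D, N)
plain_standard = chi_square_2x2(A, B, C, D)
yates_simple = simplified_chi_square(B, D, N, yates=True)
yates_standard = chi_square_2x2(A, B, C, D, yates=True)
```

When the rows each sum to N/2, |AD - BC| equals (N/2)|B - D|, which is why subtracting N/2 in the standard form corresponds to subtracting 1 from |B - D| in the simplified form. Both functions divide by marginal totals and so are undefined when a marginal is zero.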




Journal of Experimental Education 


Volume XXII
December, 1953
Number 2


A STUDY OF CERTAIN EFFECTS OF THREE
TYPES OF LEARNING EXPERIENCES IN
ART AS REVEALED IN THE DRAWINGS
BY PARTICIPANTS*


GIFFORD C. LOOMER 
Eastern Illinois State College 
Charleston, Illinois 


* Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy, University of Wisconsin, 1951, A. S. Barr.


SECTION I

DEFINITION AND IMPORTANCE OF THE PROBLEM

IT IS THE purpose of this investigation to compare the relative effects of three different art procedures in drawings made in three specific problem situations. Since the drawings resulting from the three different art experiences were put together for the evaluation of all drawings, numerically equal groups were used, with personnel assigned to each group by random number. Beyond this, common action themes for all groups provided for the comparison of drawings without regard to the conditions under which the drawings were made.

With the common problem of creating a drawing on an action theme the three groups were provided with varying amounts and kinds of sensory data as follows:

Control Group (X): With no sensory material provided, in the classroom, following oral presentation of the action theme, this group created a drawing.

Experimental Group 1 (Y): After oral presentation of the action theme, followed by presentation of a color film containing sensory material relevant to the subject matter of the action theme, this group, in the classroom environment, created a drawing on the action theme.

Experimental Group 2 (Z): After oral presentation of the action theme, and with a resource person at the site of the field trip to supply supplementary information, this group, in the field trip environment where sensory material relevant to the subject matter of the action theme was present, created a drawing on the action theme.

Three different action themes were selected with an attempt being made to choose the subject matter from the categories of 'machines', 'animals', and 'people'. An action theme was used, rather than a one word subject theme, because it was thought that this would offer the student more of a challenge, and give more opportunity for creative expression of unique, individual nature. The three action themes on which all individuals of the three groups made drawings are as follows:

1. "Fire Engines Going to a Fire"
2. "Activity in the Lion Cage"
3. "Active Children in a Country School"

The experimental method of research was used, with three college classes in Art. The three groups were taken from art classes inasmuch as:

a. More directed interest is achieved in this way.
b. Some proficiency is needed by the students in the interpretation of sensory data.
c. Some proficiency is needed in the handling of materials.

On a theoretical level, the study rises out of




the current need to plan learning experiences wisely, and use available time effectively to give maximum results. In a discussion of the functional art program, Hawley* recommends that the program must grow out of, and grow with, all other learning experiences, and she emphasizes the necessity of reaching a greater number of students. She also stressed the fact that during any one day there are many varied opportunities for art enjoyment and appreciation of form, line, color, and texture, which may be realized through observation and manipulation. Either manipulation or observation provides real art experience which is vital and functional because of a felt need, recognized through personal contact by the individual with an actual art situation.


The assumptions basic to this investigation 
are as follows: 


1. Certain experiences in the learning situation are more meaningful to the individual than others, because,
   a. Where a certain commonality of experience exists, these appear to be more naturally within the experience of the individual.
   b. The 'doing' is intriguing and interesting.
   c. These experiences are in harmony with professed beliefs.
   d. These experiences appear to offer the greatest potentialities for progress toward recognized goals.

2. In the process of evaluation a more valid judgment can be secured on the total worth of a drawing when attention is focused on a composite, rather than on isolated parts.

3. Each drawing must be regarded as an artificial form, since it is a graphic two dimensional interpretation of an individual experience.


The problems to be faced in the investigation, 
and the related assumptions suggest the follow- 
ing hypotheses: 


1. For the maximum results from a problem situation, one must have the resources for coping with the situation at hand.

2. It is important for the individual, especially where sensory cues are utilized, to respond to the total situation, not to items of fragmentary nature.


Individuals were informed prior to making the drawings that these three questions would be asked of each drawing:

1. Are the forms clearly presented?
2. Does the drawing convey a message?
3. Do all parts function to give unity to the drawing?

The source, process of refinement, and validation of these statements of criteria are described in detail in Part A, Section II. In terms of the student making the drawing, the criteria were thought to be most functional in question form.

The evaluation of individual drawings was done by a jury of five college teachers in Art Education. Each member of the jury, working on all drawings of a single action theme, evaluated each drawing on its total merit. In making this evaluation, the judges were aware of the criteria used by the students, but each was asked to use his own personal criteria of what constitutes merit in a drawing. These personal statements of criteria were submitted to the investigator, and are listed in Part A, Section II. Except for knowing that all drawings judged in a series were made by members of one class, the members of the jury were not informed as to the identity of individuals who made the drawings, or the conditions under which the drawings were made. A committee of five instructors in Art Education at the University of Wisconsin served as the jury that evaluated the drawings in this investigation.

To establish the reliability of the evaluation of drawings, the judges re-evaluated one complete action theme after a two month time interval.

A functional curriculum in the modern school has no place for Art as an isolated subject, or any subject which attempts to exist as an entity independent of life. One criterion of the effectiveness of a subject area such as Art is the extent to which activities and experiences are tied to everyday life. Art has been defined in many ways and any definition, to be acceptable, must serve our contemporary society. The breadth of Art and the necessity for its application are emphasized by Haggerty** in his definition of Art, which follows:

    The outward activities and inward experiences that are called art are the efforts of human beings to make life more interesting and more pleasing. Art objects which are the product of these activities and experiences are meaningful to the degree that they increase human enjoyment.

* Cecelia L. Hawley, "Broadening Our Conceptions of Art Education," [...] Education, IX (1939), pp. 134-13[...].
** Melvin E. Haggerty, Art A Way of Life (Minneapolis: University of Minnesota Press, 1935).


That some experiences in Art Education are more [...] than others for the achievement of specific [...] outcomes has been expressed by many writers in the field. However, there is little [...] evidence from investigations [...] art experiences have been compared, to show definite statistical advantage of one type of art experience over another.

A [...] aspect of this investigation is that a problem was presented in each teaching situation, and the individual had opportunity to respond with his total resources. As the learning experiences were set up, varying amounts of relevant [...] were present in the three situations to increase the possibility of differences in the drawings. This system of evaluation permits [...] of achievement for [...] on each of three action themes.

An [...] of the type of creative expression where no [...] data is supplied, or of the motion [...], or of the field trip may claim certain benefits of his particular type of activity for the production of drawings with merit. This investigation [...] out specifically the statistical achievement [...] each of the three art experiences, suggests advantages and disadvantages for each type of art experience, and offers areas for further investigation on this and related problems.


SECTION II

DESIGN OF THE INVESTIGATION

A. Sources of Data

THE GROUP making the drawings for this investigation consisted of 60 University of Wisconsin undergraduate students enrolled in three sections of a course offered in the Art Education Department entitled 50a (Basic Drawing). The course description for 50a, as stated in the University bulletin, is as follows: "Elementary principles [...] still life and black and white media in study of [...] and composition." Course 50a is a required course for students who have Art Education or Applied Arts as a special field. For students specializing in other fields, 50a is an elective course.

Supplementary information that helps to describe the group was collected through the use of a combined data sheet and questionnaire for each [...]. (A summary of the findings is given in the section on findings.)

From the class rosters of the three sections of 50a, where names of students appeared in the order in which they had signed up for the course, on each of the three rosters the students were numbered consecutively as their names appeared.

In this manner, each class was divided three ways and three groups were organized. An individual, assigned to a specific group, did all three drawing exercises in that group. The three groups resulting from this process were designated as follows: Control Group, Experimental Group 1, and Experimental Group 2.
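
The assignment procedure described above (rosters numbered consecutively, each class divided three ways, personnel assigned by random number) might be sketched as follows. This is only an illustration under assumptions: the exact random-number procedure is not recoverable from this text, so a seeded shuffle stands in for it; the roster sizes (21, 18, and 21, totalling the 60 students reported) and the student labels are hypothetical.

```python
import random

GROUPS = ("Control Group", "Experimental Group 1", "Experimental Group 2")

def assign_groups(roster, seed=None):
    """Divide one class roster three ways in random order.

    Students are numbered consecutively as they appear on the roster,
    shuffled with a seeded generator (a stand-in for a random number
    table), and dealt in turn to the three groups.
    """
    rng = random.Random(seed)
    numbered = list(enumerate(roster, start=1))  # (roster number, name)
    rng.shuffle(numbered)
    groups = {name: [] for name in GROUPS}
    for position, entry in enumerate(numbered):
        groups[GROUPS[position % len(GROUPS)]].append(entry)
    return groups

# Hypothetical rosters for the three sections; sizes are assumptions.
rosters = {
    "Section 1": [f"S1-{i:02d}" for i in range(1, 22)],
    "Section 2": [f"S2-{i:02d}" for i in range(1, 19)],
    "Section 3": [f"S3-{i:02d}" for i in range(1, 22)],
}

combined = {name: [] for name in GROUPS}
for index, (section, roster) in enumerate(rosters.items()):
    assigned = assign_groups(roster, seed=index)
    for name in GROUPS:
        combined[name].extend(student for _, student in assigned[name])

group_sizes = {name: len(members) for name, members in combined.items()}
```

Because each hypothetical roster size is divisible by three, every class contributes equally to each group and the three combined groups come out numerically equal, as the study required.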

Because of the manner in which the individuals 
were assigned to groups, plus the fact that each 
class was divided three ways, the assumption is 
made that this represents a random sample of the 
population that makes up the three sections of 50a. 
General makeup and characteristics of this group 
would allow the experimental findings to be applied 
widely to other groups. 

It was thought best to construct a few simple statements to act as criteria for the drawings. The reasons for having criteria of this kind are two: 1) [...] instructions that are given, these simple statements will more clearly indicate to each student the purpose and direction of this study; 2) because of the individual criteria to be used by each judge in the evaluation process, these initial criteria will serve as a point of departure from which judgment can be made.

The source, process of refinement, and validation for these criteria for the drawings is as follows: 1) Published materials in art education were [...], including books, magazines, and periodicals, with special emphasis upon works most recent. It seemed desirable to have [...] that would be readily understood, [...] to this problem. 2) [...] presented at a meeting where [...] of the three drawing classes were [...] specialization in question. 3) The three statements were accepted as valid for purposes of [...] members of the [...] preliminary meeting, held two weeks [...] evaluation of drawings was made. 4) Comparison of the three statements of criteria with the individual statements of criteria submitted by the five judges reveals no sharp conflicts.
Two weeks before the first evaluation of drawings was made the judges agreed to submit a series of statements that would outline each judge's conception of 'What constitutes merit in a drawing?' Since [...] do this in individual conferences, the investigator held individual conferences with each judge, wrote up the statements of criteria arrived at in the interview, and submitted this to each judge for possible changes or revisions. After [...] changes and revisions, the individual [...] judges were in final form. While [...] investigation, the things most commonly stressed as qualities of a good drawing (mentioned by two or more judges) appear to be: sound organization, individual technique, interesting presentation of elements, careful rendering, space filling quality, ideas expressed readily, clear and articulate expression, sincere subjective presentation. Things that evidently detract from the merit of a drawing, according to these five judges, are: the inclusion of meaningless material, inadequate understanding of subject matter, inadequate grasp of line and perspective, lack of strength of line, evidence of imposed technique, and lack of total integrity. In listing the individual criteria, and throughout this report, the five judges will be identified as A, B, C, D, and E.


Personal Criteria of Judge A 


1. The essential quality of the drawing, or, design for its abstract quality.
2. Sound organization or structure, with [...]
[...] indicates that the student [...] with no imposed technique

Personal Criteria of Judge B 


1. Ease of pictorial structure
   a. The drawing should be designed for the whole space.
2. Ideas visualized readily
   a. Ideas presented should not obviously be from second-hand sources.
   b. Direct experience results in richer visual conception in a drawing than vicarious experience.
3. Ease of representation of objects
   a. There should be evidence of individual inventive capacity in the drawing.
   b. Pictorial organization is one of the most difficult things to learn, but is important in the evaluation of a drawing.


Personal Criteria of Judge C 


1. Expression should be clear and articulate, with a distinct [...]
2. Drawing should show a [...] mediums as well as control of technique.
3. Essential elements [...] subject.
4. Subjective presentation should be sincere.
5. Accuracy of perception is important in the drawing process and a clear understanding of natural form provides a strong resource for imaginative pictorial construction.
6. It [...] to consider the total integrity of the picture, hence, it is important to develop certain devices to bring about a resolution of the pictorial forces, giving a sense of completeness, even if just by implication.


Personal Criteria of Judge D 


1. All elements of a drawing should function.
2. Since the passive or undesirable detracts from the total merit of a drawing, the drawing should have no meaningless or irrelevant material.
3. All elements of the drawing should have clarity of presentation.
4. Recognizability with precision is an important quality, although distortion (as long as the preceding is retained) is the prerogative of the artist.
5. The drawing should present ideas in such a way that there is no groping on the part of the alert observer.
6. Subject matter of a drawing should be presented in a manner which indicates that the subject matter has complete meaning for, or is thoroughly understood by, the individual.


Personal Criteria of Judge E


1. Form
   a. Treatment of the drawing as a unit through a sound overall design, balance, and careful placement of objects.
   b. A quality of line which gets the most out of the medium used, and a variety of textures.
   c. Evidence of a correct knowledge and wise use of linear and atmospheric perspective.
2. Concept
   a. An individual concept, and a complete awareness of the concept.
3. Sensibility
   a. A feeling put across by the drawing that indicates a personal experience with, and a personal feeling about, the subject portrayed.


Drawing materials used in the investigation included soft black pencils (General Carbo-Weld Exercise, no. 931) and good quality heavy white drawing paper, size 18 x 24 inches. Each student furnished his own drawing board.


B. Procedure in Collecting Data


Descriptions of Three Types of Learning Exercises


Control Group. In the classroom environment this group was given the action theme, and allowed to draw for 30 to 45 minutes, then the drawings were collected. The three action themes were as follows:




December 1953) 


1. “Fire Engines Going to a Fire”
2. “Activity in the Lion Cage”
3. “Active Children in a Country School”


Experimental Group 1. In the classroom environment this group was given the action theme, shown a sound color film that featured subject matter in varying amounts pertaining to the action theme, then allowed to draw immediately after seeing the film, for 30 to 45 minutes, then the drawings were collected. On theme 1, the film “Fire Engines” was shown. This film has the locale of a big city fire station, and shows the routine life of a fireman working around the fire engines. The action of each division of the fire company is portrayed when the alarm is sounded and they go to the fire. There is quite a strong emotional impact from the nature of the subject matter, and the vivid action portrayed. The running time on this film is 11 minutes. On theme 2 the film “The Zoo” was shown, also an 11 minute film. Although many animals are featured, a large part of the film is concerned with lions. While this film does not have as much emotional impact as the previous one, the subject matter is interesting, and has appeal to human nature. On theme 3 the film “Schoolhouse in the Red” was shown; there is an emotional effect, due to the strong story concerned with consolidation of rural schools. However, of more immediate use for a drawing on this theme is the part which shows a small one room country school with typical action in and outside the school. The running time on this film is 45 minutes.

Experimental Group 2. After assembling at a prearranged starting point the students were taken by cars to the particular field trip destination. Upon arriving at the field trip destination the students were given the action theme that they would interpret in a drawing. Each field trip situation was different, as explained in a brief discussion of each that follows, but in each field situation students were allowed to draw from 30 to 45 minutes in that environment, and the drawings were collected. The field trip concerned with theme 1 was to the Main Fire Station, Madison, Wisconsin. A captain in the fire department met the group, and led a conducted tour, lasting about 30 minutes, through the fire station. Verbal descriptions and actual demonstrations were included, with opportunity for questions by the students. The students were then allowed free run of the station to select whatever subject matter they wished to include in the drawings. The field trip concerned with theme 2 was to the lion house at the Vilas Park Zoo, Madison, Wisconsin. An attendant was present to point out characteristics and eating habits of the lions, and to answer any questions. The field trip was held at a time when the lions were inclined to be more active, and moving around. The students could


LOOMER 69 


use whatever sensory material they found in that 
field trip situation, in their drawings. The field 
trip concerned with theme 3 was taken to the Catfish
Rural School, Dane County, Wisconsin. This
is a large one-room rural school located about 
six miles north of Madison, just off highway 113. 
Students were taken to this school while the school 
was in session, allowed to observe, and make 
drawings there while the school activities were go- 
ing on. Part of the school room activity included 
organized classwork, part of it included recess, 
where some of the pupils chose to watch the draw- 
ings being made. The school teacher was available 
for questions, and the students talked freely with 
her. Again, the students were free to use whatever 
sensory materials they wished, in their drawings. 


General Instructions to All Students Making Drawings


Since three classes were used, the general in- 
structions had to be given three times, in each of 
three classroom situations. After the investigator 
had been introduced, the title and purpose of the 
investigation were given. The setup of three par- 
allel groups was explained, along with explanation 
and example of the type of thing done in each group. 
That assignment to groups would be done on a ran- 
dom, impartial basis was brought out, besides the 
fact that once assigned, a person would make three 
drawings in that same group. The three statements 
that were to serve as criteria for the drawings were 
presented, with opportunity for questions on any 
part of the explanation. The fact that the drawings 
would be put together for the evaluation was pointed 
out, and that the evaluation jury consisted of five 
college art teachers, each using his own criteria 
in the process. The students were informed that
they would not be graded on this, except as their 
work counted toward daily class work. The stu- 
dents were informed as to the type of materials 
that would be used, that except for supplying their 
own drawing boards, all materials would be furn- 
ished. An attempt was made to keep all explana- 
tions to all three classes as uniform as possible; 
however, the above information was regarded as 
essential, so none was omitted. 


Evaluation of Drawings 


After all the drawings had been collected they 
were put into three groups according to the action 
theme, making 60 drawings each on three themes. 
Since the students' names were on the back, an 
identifying three digit number was the only identi- 
fying mark on the front side. This number was 
placed in the upper right hand corner, in black India
ink, with the number 3/4 inches high. Since
these numbers were not assigned in any logical
sequence to represent groups, when displayed they
ran clockwise around the room, from the smallest


πο JOURNAL OF EXPERIMENTAL EDUCATION 


to the largest number.
For the first evaluation, three sets of 60
drawings were hung in two rows, at near eye 
level in three unused classrooms, one large, and 
two smaller rooms. Prior to hanging the draw- 
ings, all windows had been covered with brown 
paper (to admit no light, since evaluation was 
done in the daylight hours) and the electric light 
was equally strong on all drawings. Each judge 
worked independently, with the help of the inves- 
tigator to mark his choices as they were made. 
One complete theme was judged at a time, with 
one evaluation sheet for each of the three action 
themes. 

Reference is made here to the sample evaluation sheet, or scoring sheet, in the appendix,* and included here. This evaluation sheet has spaces in which numbers of individual drawings are to be placed. On the left hand side of the


cluding spaces for 15 numbers. The caption under each block is self-explanatory, but by filling every space the number of every drawing of
The spaces on the right hand side are for the ten best drawings in


on this sheet for the convenience of the judges.


There was no specified order in which this evaluation sheet was


best 15 drawings were selected
marked down in the
small colored identi
corner of the 15 selected drawings.
15 marked drawings,
selected, the numbers put in the two spaces in the upper right. 3) From the remaining 13 marked drawings the 8 best were selected, numbers marked in the block of 8 spaces in the upper right. 4) To the other extreme, the 15 poorest drawings of the entire 60 were selected, numbers marked in the spaces of the ‘fourth quartile’, and these 15 drawings marked with a different colored tag, for identification. 5) From the 15 drawings marked by the 2nd color, the two poorest drawings were selected, numbers marked in the spaces for the two poorest in the lower right. 6) From the remaining 13 drawings in this group the 8 poorest were selected, numbers put in the block of 8 spaces in the lower right. 7) From the remaining 30 unmarked drawings, the 15 best were selected, and the numbers marked down in the ‘second quartile’ block of 15 spaces. 8) Since all 60 are to be represented on this sheet, the

investigator was able to fill in the 15 spaces remaining, in the ‘third quartile’ block. After one complete action theme had been judged, all colored tags were removed, before the next judge began his evaluation. After all five judges had evaluated the three action themes of 60 drawings each, the drawings were taken down.

* All references to Appendices may be found in original thesis on file in Library, University of Wisconsin, Madison, Wisconsin.

At the time of the first evaluation of all drawings (180) the judges knew that they would be asked to make another evaluation of one complete action theme (60 drawings) after a two month time lapse, but the judges did not know which of the three action themes would be the one selected for reevaluation. By a system of random selection, theme 1 (“Fire Engines Going to a Fire”) was the one selected for the second evaluation. Before these drawings were hung for the second evaluation the numbers were changed to two digit numbers, 3/4 inches high and located in the upper right corner. The reason for changing the numbers was that if visual memory persisted from one evaluation to the next, it would be of the drawings themselves, and not of the way they were identified by number.

For the second evaluation, two months after the first evaluation was made, the drawings were displayed in the same large classroom that had been used previously. The same precautions were taken here to insure adequate even lighting on all drawings, as had been taken in the first evaluation, and the drawings were hung at near eye level, in two rows. Since the numbers assigned in this second instance had been done with no logical sequence to represent groups, when displayed, they ran clockwise around the room, from the smallest to the largest number. Judges worked independently as before, with the investigator present to put tags on the drawings as they were selected. The same score sheet, or evaluation sheet, was used here, as in the first evaluation, with the same procedure of selection that has been described. After each judge


finished his evaluation of the 60 drawings on this
action theme, the ident


to eliminate the possib
ments influencing ano


through 40 the Experim
60 the Experimental Gr




Evaluation of Drawings


The following questions were presented as criteria to act as directional guides for individual drawings:

1. Are the forms clearly presented?
2. Does the drawing convey a message?
3. Do all parts function to give unity to the drawing?


First Quartile - The Best Fifteen Drawings (No particular order in this group)
    From 1st Quartile, The Two Best
    From Remaining 13 of the 1st Quartile Select the Eight Best Drawings (No particular order in this group)

Second Quartile - 2nd Best Fifteen Drawings

Third Quartile - 3rd Best Fifteen Drawings

Fourth Quartile - 4th Best Fifteen Drawings
    From 4th Quartile, The Two Poorest
    From Remaining 13 of the 4th Quartile Select the Eight Poorest Drawings




With a total of 60 drawings on each action theme, division of the 60 into quartiles gives an even 15 in each quartile. With the evaluation sheet used, the numbers arrange themselves from one extreme to the other as follows: 2, 8, 20, 20, 8, 2, which total 60 and approximate the normal probability curve. Since the score sheet, or evaluation sheet, is set up on this basis, it is a simple matter to assign scores 1 through 6 on the basis of these 6 levels. According to this plan, only two scores out of the 60 can have the value of 6, the highest score; only two scores can have the value of 1, the lowest score, etc. From the twenty evaluation sheets, which numbered the drawings from 1 through 60, the table of raw data was compiled as a summary of all evaluations made by the judges. It may be pointed out here that the 2, 8, 20, 20, 8, 2 correspond roughly to the following percents that go to make up the normal probability curve: 4, 13, 33, 33, 13, 4, totalling 100%. Although there are six resulting levels, these are in no way meant to represent grades, but merely levels of performance. When a jury has to select 20 paintings to hang in a given space from 100 submitted, for example, the process is simply one of selection of the best 20. In a parallel situation, when a judge has to select the 15 best drawings from the total number of 60, it eliminates the other 45 from consideration at that stage.
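The conversion from a completed evaluation sheet to scores on the 1 through 6 scale can be sketched in Python. This is only an illustration of the 2, 8, 20, 20, 8, 2 scheme described above: the block names, the `scores_from_sheet` helper, and the example sheet (drawings numbered 1 to 60 and placed in numerical order) are hypothetical and are not taken from the study's raw data.

```python
from collections import Counter

# Score attached to each block of the evaluation sheet (block names are hypothetical).
# The five first-quartile drawings not chosen among the ten best share level 4 with
# the second quartile; the five fourth-quartile drawings not chosen among the ten
# poorest share level 3 with the third quartile. This yields 2, 8, 20, 20, 8, 2.
BLOCK_SCORES = {
    "two_best": 6,
    "eight_best": 5,
    "first_quartile_rest": 4,
    "second_quartile": 4,
    "third_quartile": 3,
    "fourth_quartile_rest": 3,
    "eight_poorest": 2,
    "two_poorest": 1,
}


def scores_from_sheet(sheet):
    """Map each drawing number on one judge's sheet to a score from 1 to 6."""
    scores = {}
    for block, drawing_numbers in sheet.items():
        for number in drawing_numbers:
            if number in scores:
                raise ValueError(f"drawing {number} appears in more than one block")
            scores[number] = BLOCK_SCORES[block]
    return scores


# Hypothetical sheet: drawings 1..60 placed into the blocks in numerical order.
ids = list(range(1, 61))
example_sheet = {
    "two_best": ids[0:2],
    "eight_best": ids[2:10],
    "first_quartile_rest": ids[10:15],
    "second_quartile": ids[15:30],
    "third_quartile": ids[30:45],
    "fourth_quartile_rest": ids[45:50],
    "eight_poorest": ids[50:58],
    "two_poorest": ids[58:60],
}

scores = scores_from_sheet(example_sheet)
distribution = Counter(scores.values())
print([distribution[level] for level in (6, 5, 4, 3, 2, 1)])
```

Each sheet accounts for all 60 drawings exactly once, so every judge's scores on a theme have the same mean and the same spread; only the assignment of drawings to levels differs between judges.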


C. Treatment of Data

Controls were employed in the investigation as follows:


1. Each person in these three drawing classes had the same chance for random assignment to each of the three groups.

2. Regardless of group, each person had the same three action themes.

3. Regardless of group, each person had the same time limit on drawings.

4. Regardless of group, each person used the same materials, black drawing pencil and white drawing paper.

5. Regardless of group, each person had the same general instructions and explanation.

6. Since each of the three classes was divided three ways, the differences between classes were minimized.

7. Regardless of group, each person had the same three criteria for the drawings.

8. The same five judges made evaluations on the drawings of all the students in the investigation.

9. The information requested on the individual questionnaire and data sheet was the same for all.

10. Regardless of group, each person had to make drawings on three different days, and time allotment was the same for all.


The process of evaluation assigns a level of achievement to each drawing, on every action theme. As discussed in the description of the procedure in Part B, just preceding this part, the numbers of specific drawings were transposed into numbers on the scale 1 through 6, with 6 being the high end of the scale. The resulting table of raw data was used for the basis of all statistical computation by which comparisons and contrasts are shown. This table of raw data is located in the appendix (found in the original thesis), and will be referred to often in the discussion of how the data was treated.

At an early stage the analysis of variance technique was employed using the means of the three groups on each of three action themes. In table form, three means can be averaged on each group, and a grand mean calculated. Using the formula found on page 336 of Alexander M. Mood's Introduction to the Theory of Statistics, the analysis of variance problem was worked out. This formula is given below.

The question to be answered in this process was, “Does the among variance exceed the within variance by an amount that is statistically significant?”

At another stage, “Student's t” scores were calculated to test the significance of differences between the mean achievement of various groups under different conditions. “Student's t” scores were worked out for the second evaluation of theme 1, in addition to the regular evaluations of themes 1, 2, and 3. To determine the probability that differences as large as those observed could have occurred by chance, the table of the distribution of “Student's t” was consulted. By this method, the level of probability for each score was determined.

As a check on the calculations concerned with “Student's t,” critical ratios were calculated on the second evaluation of theme 1, besides the regular evaluations of themes 1, 2, and 3. The formula used in calculation of “Student's t” was as follows:

$$ t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{s_1^2 + s_2^2}{N - 1}}} $$

The formula used in the calculation of the critical ratios was as follows:

$$ \text{C.R.} = \frac{\text{Diff. (means)}}{\sigma_{\text{Diff.}}} $$
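The two statistics can be sketched in Python under stated assumptions. The t function follows the equal-N form with N - 1 in the denominator, where each variance is taken with divisor N; the source does not define the standard error of the difference used in the critical ratio, so the usual large-sample form (the square root of the sum of the two squared standard errors of the means) is assumed here. The function names and the two short score lists are hypothetical and are not data from the study.

```python
from math import sqrt


def mean(values):
    return sum(values) / len(values)


def variance_n(values):
    """Variance computed with divisor N (not N - 1)."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def student_t_equal_n(group_1, group_2):
    """t for two groups of equal size N: difference of means over sqrt((s1^2 + s2^2) / (N - 1))."""
    n = len(group_1)
    if n != len(group_2) or n < 2:
        raise ValueError("both groups must have the same size, at least 2")
    standard_error = sqrt((variance_n(group_1) + variance_n(group_2)) / (n - 1))
    return (mean(group_1) - mean(group_2)) / standard_error


def critical_ratio(group_1, group_2):
    """Difference of means over sigma_diff.

    Assumption: sigma_diff = sqrt(s1^2 / N1 + s2^2 / N2), the large-sample
    standard error of a difference between two independent means.
    """
    sigma_diff = sqrt(
        variance_n(group_1) / len(group_1) + variance_n(group_2) / len(group_2)
    )
    return (mean(group_1) - mean(group_2)) / sigma_diff


# Hypothetical scores on the 1-6 scale for two small groups.
group_z = [4, 5, 3, 4, 4]
group_y = [3, 3, 2, 4, 3]

t_value = student_t_equal_n(group_z, group_y)
cr_value = critical_ratio(group_z, group_y)
print(round(t_value, 3), round(cr_value, 3))
```

With 20 students per group the two denominators differ only by the factor N versus N - 1, which would explain why the tabulated t scores and critical ratios agree to within a few hundredths.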


Source         Sum of Squares                                                                     Degrees of Freedom    Mean Square
Row            $mc \sum_{i} (\bar{X}_{i..} - \bar{X})^2 = S_1$                                     $r - 1$               $S_1 / (r - 1)$
Column         $mr \sum_{j} (\bar{X}_{.j.} - \bar{X})^2 = S_2$                                     $c - 1$               $S_2 / (c - 1)$
Interaction    $m \sum_{ij} (\bar{X}_{ij.} - \bar{X}_{i..} - \bar{X}_{.j.} + \bar{X})^2 = S_3$     $(r - 1)(c - 1)$      $S_3 / [(r - 1)(c - 1)]$
Deviations     $\sum_{ijk} (X_{ijk} - \bar{X}_{ij.})^2 = S_4$                                      $rc(m - 1)$           $S_4 / [rc(m - 1)]$
Total          $\sum_{ijk} (X_{ijk} - \bar{X})^2$                                                  $rcm - 1$

k = groups, m = cases, i = individuals, j = columns
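A minimal Python sketch of a two-way analysis of variance with m observations per cell, following the row, column, interaction, and deviation sums of squares laid out above. The function name `two_way_anova` and the 2 x 2 x 2 toy data are hypothetical; the study itself had 3 groups, 3 themes, and 20 students per cell. Each F ratio is formed against the deviation (within-cell) mean square, which is an assumption about how the “F Test” column was computed.

```python
def two_way_anova(data):
    """data[i][j] is the list of m scores for row i (group) and column j (theme)."""
    r, c, m = len(data), len(data[0]), len(data[0][0])
    all_scores = [x for row in data for cell in row for x in cell]
    grand = sum(all_scores) / (r * c * m)

    row_means = [sum(x for cell in row for x in cell) / (c * m) for row in data]
    col_means = [sum(x for row in data for x in row[j]) / (r * m) for j in range(c)]
    cell_means = [[sum(cell) / m for cell in row] for row in data]

    s1 = m * c * sum((rm - grand) ** 2 for rm in row_means)        # rows
    s2 = m * r * sum((cm - grand) ** 2 for cm in col_means)        # columns
    s3 = m * sum(                                                  # interaction
        (cell_means[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(r) for j in range(c)
    )
    s4 = sum(                                                      # deviations within cells
        (x - cell_means[i][j]) ** 2
        for i in range(r) for j in range(c) for x in data[i][j]
    )
    total = sum((x - grand) ** 2 for x in all_scores)

    ms_deviation = s4 / (r * c * (m - 1))
    return {
        "S1": s1, "S2": s2, "S3": s3, "S4": s4, "total": total,
        "F_row": (s1 / (r - 1)) / ms_deviation,
        "F_column": (s2 / (c - 1)) / ms_deviation,
        "F_interaction": (s3 / ((r - 1) * (c - 1))) / ms_deviation,
    }


# Toy data: 2 rows x 2 columns x 2 scores per cell (not from the study).
toy = [[[1, 3], [2, 4]], [[5, 7], [6, 8]]]
result = two_way_anova(toy)
print(result)
```

The four component sums of squares add up to the total, which is a convenient check on any hand computation of the table.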


The level of probability of critical ratios was determined, as had been done for the “Student's t” score.

A further check on the differences in achievement of the three different groups was made by calculation of critical ratios, using the same formula for critical ratio as shown above. The level of probability for these critical ratios was determined. As a method of checking this work, “Student's t” scores were calculated by the use of the same formula for “Student's t” as shown above.

The extent of agreement of judges was studied by calculation of the coefficient of correlation of every judge with every other judge on each of three action themes in the regular evaluation. As a result of this process, 30 coefficients of correlation were calculated. This also made possible the calculation of a median coefficient of correlation for all judges on each of three action themes.

The extent of reliability of evaluation was determined by two methods: the first established the extent of agreement between two judgments of one person, as expressed in a coefficient of correlation for each judge; the second established the extent of agreement between the averaged first and second judgments of all five judges, as expressed in a coefficient of correlation for each of three groups. The formula used in the calculation of these coefficients of correlation was as follows:

[Formula for the coefficient of correlation; illegible in the source.]

In the first instance five coefficients of correlation resulted, one for each judge. In the second instance, three coefficients of correlation resulted, one for each group, X, Y, and Z. All coefficients of correlation were calculated on the first and second evaluations of action theme 1.
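Since the printed formula cannot be read, the following Python sketch assumes the ordinary product-moment (Pearson) coefficient in its raw-score form, which is the usual choice for this kind of test-retest comparison; it should not be taken as the author's exact formula. The function name and the two short lists of scores are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equally long lists of scores (raw-score form)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_xx = sum(a * a for a in x)
    sum_yy = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_xx - sum_x ** 2) * (n * sum_yy - sum_y ** 2))
    if denominator == 0:
        raise ValueError("correlation is undefined when one list has no variation")
    return numerator / denominator


# Hypothetical scores given by one judge to the same four drawings on two occasions.
first_evaluation = [1, 2, 3, 4]
second_evaluation = [1, 3, 2, 4]
reliability = pearson_r(first_evaluation, second_evaluation)
print(reliability)
```

For a single judge, the two lists would hold that judge's 60 scores on theme 1 from the first and second evaluations, paired drawing by drawing.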


If the reliability of an investigation such as this were to be based primarily on the agreement of one person's first and second judgments, this method is proposed as one which would increase the extent of reliability: Evaluations of the three judges having the highest coefficient of correlation between their two evaluations were averaged, then by the use of the same formula as that used above, the coefficient of correlation was calculated between the highest and the lowest group of judges. (The lowest group in this case would consist of two judges, whose two evaluations could also be averaged.) The resulting coefficient of correlation between the two groups would indicate whether the results would have been significantly different if only three, instead of five, judges had been consulted.
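The pooling step described above can be sketched in Python: average the scores of the more reliable judges drawing by drawing, do the same for the less reliable judges, and correlate the two averaged series. The grouping of judges is passed in explicitly; the helper names and the four-drawing ratings are hypothetical, and the product-moment coefficient is assumed as the correlation formula.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation (assumed formula), raw-score form."""
    n = len(x)
    num = n * sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y)
    den = sqrt(
        (n * sum(a * a for a in x) - sum(x) ** 2)
        * (n * sum(b * b for b in y) - sum(y) ** 2)
    )
    return num / den


def average_ratings(ratings, judges):
    """Average, drawing by drawing, the scores given by the named judges."""
    columns = zip(*(ratings[j] for j in judges))
    return [sum(column) / len(judges) for column in columns]


# Hypothetical scores (1-6 scale) from five judges on the same four drawings.
ratings = {
    "A": [6, 4, 3, 1],
    "B": [4, 5, 2, 3],
    "C": [5, 4, 3, 2],
    "D": [4, 4, 3, 3],
    "E": [4, 5, 4, 1],
}

most_reliable = average_ratings(ratings, ["A", "C", "D"])
least_reliable = average_ratings(ratings, ["B", "E"])
agreement = pearson_r(most_reliable, least_reliable)
print(most_reliable, least_reliable, agreement)
```

A high coefficient between the two pooled series would suggest that the smaller panel of three judges orders the drawings in much the same way as the remaining two.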

How the accumulated data has been treated has 
been indicated. The findings from the treatment 
of this data will be listed in Part A of Section III; 
Part B of Section III will describe the group, according
to the results of the individual data sheet
and questionnaire; Part C of Section III will com- 
pare group accomplishment, and through the dis- 
cussion will interpret the findings to the end that 
the findings support the inferences made, and 
lead to the conclusions of the investigation. 


SECTION III 


STATISTICAL ANALYSIS OF FINDINGS 


A. Statistical Findings 


THE RAW data gathered from sources indicated in Part A of Section II, and treated as described in Part C of Section II, will be presented in this section under the following headings: 1) Analysis of Variance, 2) Variance Between Groups, 3) Group Accomplishment by “Student's t” and Critical Ratio, 4) Agreement of Judges, 5) Agreement Between ‘Most Reliable’ and ‘Least Reliable’ Judges, 6) Reliability of Performance by Judges, and 7) Reliability of Performance by Groups. In some instances the analysis has resulted in the findings presented here; in some instances additional comment is included in this section, and in Part C of this section, to further analyze and interpret the findings.


1. Analysis of Variance:

Source          Sum of Squares    Degrees of Freedom    Mean Square    “F Test”
Row*                 4.84                  2               2.420       4.219***
Column**              .00                  2                .00         .00
Interaction          2.044                 4                .511        .890
Deviation           98.21                171                .574
Total              105.054               179

* This represents the three different groups: Control, Experimental 1, and Experimental 2.
** This represents the three action themes, 1, 2, and 3.
*** In the “F Test” this amount is significant at the .05 level.


2. Variance Between Groups:

                                              Critical Ratio    “Student's t”
X to Y (Control to Experimental 1)                 1.76              1.76
X to Z (Control to Experimental 2)                 1.27              1.28
Y to Z (Experimental 1 to Experimental 2)          2.84*             2.84

* Significant at the .01 level.


3. Comparison of Group Accomplishment:

                    Comparison    “Student's t”    Critical Ratio    Direction
Action Theme 1      X to Y            .567              .567             X
                    X to Z            .088              .088             Z
                    Y to Z            .102              .102             Z
Action Theme 1a     X to Y            .707              .707             X
                    X to Z            .158              .160             Z
                    Y to Z           1.439             1.444             Z
Action Theme 2      X to Y            .981              .981             X
                    X to Z           2.115*            2.105             Z
                    Y to Z           2.911**           2.911             Z
Action Theme 3      X to Y           1.509             1.500             X
                    X to Z            .078              .077             Z
                    Y to Z           1.317             1.317             Z

* Significant at the .05 level.
** Significant at the .01 level.


4. Agreement of Judges:

Judges agree to the extent shown by the following coefficients of correlation, on evaluation of all 60 drawings of the three action themes:

Judges    Theme 1        Theme 2    Theme 3
AB          .24            .52        .15
AC          .32            .48        .32
AD          .48            .30        .58
AE          [illegible]    .31        .54
BC          .44            .59        .66
BD          .42            .18        .42
BE          [illegible]    .30        .35
CD          .42            .28        .46
CE          [illegible]    .39        .34
DE          .27            .15        .65

Median scores for the three themes are as follows:

Theme 1    .295 (Interpolated)
Theme 2    .305 (Interpolated)
Theme 3    .44  (Interpolated)


5. Agreement Between ‘Most Reliable’ and ‘Least Reliable’ Judges:

On the basis of each judge's coefficient of correlation between his two evaluations (shown in Part 6 immediately following), the five judges were assigned to two groups: Most reliable, the ones with the three highest coefficients of correlation between their own two evaluations of action theme 1, and Least reliable, the two with the lowest coefficients of correlation. The evaluations are averaged for each group, and the coefficient of correlation between groups calculated using the same formula as before. Group ACD to BE gave a coefficient of .46.


6. Reliability of Performance by Judges:

On two evaluations of one complete action theme (60 drawings), with a two month time lapse between evaluations made by the same judge, the following coefficients of correlation were calculated:

Judge A    .72
Judge B    .56
Judge C    .83
Judge D    .80
Judge E    [illegible]


7. Reliability of Performance by Groups:

When one complete action theme (60 drawings) was evaluated twice, with a two months time lapse, the ratings of five judges were averaged and the coefficients of correlation were calculated for the three groups X, Y, and Z.

X to X' (Control Group)           .76
Y to Y' (Experimental Group 1)    .90
Z to Z' (Experimental Group 2)    .91
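The count of judge pairs and the interpolated median reported in item 4 can be checked with a short Python sketch. The ten Theme 2 coefficients are entered as they are read from the Theme 2 column of that table (several are printed without a decimal point in the source); the variable names are illustrative only.

```python
from itertools import combinations
from statistics import median

judges = ["A", "B", "C", "D", "E"]

# Every judge paired with every other judge: AB, AC, ..., DE.
pairs = ["".join(pair) for pair in combinations(judges, 2)]

# Theme 2 coefficients as read from the table, in the same pair order.
theme_2 = dict(zip(pairs, [.52, .48, .30, .31, .59, .18, .30, .28, .39, .15]))

# With ten values the median falls between the 5th and 6th ordered values,
# which is why the reported medians are marked "Interpolated".
median_theme_2 = median(theme_2.values())

print(len(pairs), len(pairs) * 3, round(median_theme_2, 3))
```

Five judges give ten pairs per theme and therefore 30 coefficients over three themes, and the Theme 2 values reproduce the reported median of .305.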


B. Summary of Individual Data Sheet and Questionnaire


Reference is made here to the sample individual data sheet and questionnaire in the Appendix and found on page 79 of this study. The purpose of this data sheet was to supply important information on each student relative to his background, training, experience, and co-curricular activities; to provide information about the extent of contact with subject matter of the action themes used in this investigation; and to allow each student the opportunity to express his opinion about the outcomes of this study, and to indicate his interests in particular aspects of the material used.

Since these data sheets were filled out in the classroom after all drawings had been completed, the students were in a position to express themselves. Students were encouraged to add any additional comment that they wished, which many chose to do. There was 100% return on these data sheets, so that a complete account is available.

The summary of all compiled information is shown in table form on the following pages; the total information is listed in the first column, with the breakdown according to the three groups that featured different art experiences.

With opportunity to describe one's own style of drawing, the most common descriptive terms include such things as: ‘none developed yet,’ sketchy, exact, loose, detailed, childish, indescribable, cartoon, coarse lines, ‘don't know,’


SUMMARY TABLE

(The Total and Control columns of this part of the table are illegible in the source.)

                                          Experimental 1    Experimental 2
Classification
    Freshmen                                    16                 9
    Sophomores                                   4                 6
    Juniors                                      0                 3
    Seniors                                      0                 2
Home Address
    Madison, Wisconsin                           4                 4
    Within Wisconsin                            12                11
    Out of State                                 4                 5
Subject Area
    Applied Art                                  5                 3
    Art Education                                7                 4
    Other Subject Areas                          8                13
Number of College Art Courses
    1                                            8                11
    2                                            1                 3
    3                                            7                 3
    4                                            2                 3
    5                                            2                 0
Semesters of High School Art
    0                                            6                12
    1                                            1                 1
    2                                            6                 2
    3                                            1                 4
    4                                            3                 0
    5                                            0                 0
    6                                            1                 0
    7                                            0                 0
    8                                            1                 0
    12                                           1                 1

The remaining row labels on this page are listed below. Their Experimental 1 and Experimental 2 entries follow in source order, since the alignment of entries to rows cannot be verified.

Number Having Private Art Instruction: 1 year or less; 1 - 2 years; Over 2 years
Number of Visits to Fire Station Since Starting High School: 0; 1; 2; 3; 5; 10
Number Who Have Never Visited a Fire Station
Number of Visits to Lion House of a Zoo Since Starting High School: [row labels illegible]

Experimental 1, Experimental 2 entries in source order: 1, 0; 8, 1; 1, 0; 4, 4; 9, 8; 2, 2; 0, 2; 1, 0; 1, 1; 0, 0; 10, 3; 1, 1; 3, 2; 4, 5; 4, 3; 1, 2; 0, 0; 0, 2; 0, 0


(The figures for the following rows of the summary table are illegible in the source; only the row labels are recoverable.)

Number of Visits to Lion House of a Zoo Since Starting High School (continued): 8; 10; 12; 40; 200
Number Who Have Never Visited the Lion House at a Zoo
Number of Persons Who Have Attended a Country School
Years Attended in a Country School: 1/2; 2; 5; 6; 8
Opinion of 45 Minutes as a Time Limit?: Not adequate; O.K. for Sketch; Plenty of Time; No Opinion
Group Preferred if Study were Repeated?: Control Group; Experimental Group 1; Experimental Group 2
Which Group Do You Think Excelled: Control Group; Experimental Group 1; Experimental Group 2; No opinion

Reasons Given for Specific Group Preferred?

A. Control Group
    Better to work from imagination
    Favors thinking out art organization
    Better for use of originality
    Better for creativeness
    More freedom in drawing
    Enjoys using imagination here
    Favors individual interpretation
    Easier, not so much to do

B. Experimental Group 1
    Presents most ideas, novel nar
    Time saving, no running around
    Experimentation, compare results
    Easy way to get ideas
    Interesting, permits use of imagination
    Movies are entertaining


                                          Total    Control    Experimental 1    Experimental 2
C. Experimental Group 2
    More interesting from real life        12         2             1                 9
    Something definite to look at           8         8             4                 1
    Gives needed information here          10         1             8                 1
    Easier to draw from actual object       4         0             2                 2
    No need to rely on memory               3         3             0                 0
    Get around more to new locations        1         0             0                 1
    Practice valuable on the spot           1         0             0                 1
    Get better action here                  1         1             0                 0
    Get more feeling here                   1         1             0                 0
    No opinion                              1         1             0                 0

On Which Theme Do You Think You Made the Best Drawing?
    1 (Fire Engines)                       13         3             4                 6
    2 (Lion)                               18         5             6                 7
    3 (School)                             28        11            10                 7
    No Opinion                              1         1             0                 0

Reasons Why You Think that (Above) Drawing is Best?
    More interested                        14         4             1                 9
    No reason                               9         5             3                 1
    Like action theme                       8         3             3                 2
    More acquainted                         8         3             5                 0
    Easier to draw                          7         0             3                 4
    More unity, organization                3         1             2                 0
    Flexible, free                          3         0             2                 1
    More action                             2         0             0                 2
    More humor                              1         1             0                 0
    More use of imagination                 1         1             0                 0
    More value and pattern                  1         1             0                 0
    Less complex                            1         1             0                 0
    More relaxation                         1         0             1                 0
    Fulfilled assignment                    1         0             0                 1


or ‘enjoy expression.’ Many students responded to this item in a manner that indicated that they expected this art course and others to help them ‘develop’ a style in the course of their education. Reference to the results obtained through the use of the individual data sheet will also be made in Part C that follows, in the analysis of group accomplishment.


C. Analysis of Group Accomplishment


Where group accomplishment was compared on each of the three action themes, plus the second evaluation of theme 1, the accomplishment of Experimental Group 2 (Z group, or the Field Trip group) was greater in every instance when compared with the other two groups. When compared with the control group (X group, or creating on a theme with no provided sensory material) the Z group had the advantage in all four comparisons, but in no instance was the amount great enough to be statistically significant. When compared with Experimental Group 1 (sound-color movie group, Y group) the Z group had the advantage in all four comparisons, but only on theme 2 was the amount statistically significant, on the .01 level. When groups X and Y were compared, group X had the advantage in all four comparisons, but only on theme 2 was the amount statistically significant, on the .05 level.

When the reliability of performance by groups was checked on 60 drawings of theme 1, which was judged twice by five judges, there were slight differences in the stability of groups, Z group

y^ 


Decemb 
er 1953) LOOMER 


Individual Data Sheet and Questionnaire


Name ____________________
Classification: Fr.___ Soph.___ Jr.___ Sr.___
Major Subject Area ____________________
Home Address ____________________

Please fill out this data sheet and questionnaire, answering every
item to the best of your ability. The investigator had no official
relationship to any teacher who participated in this study. When
completed, hand this in to your instructor.

Write names and course numbers of all art courses you have had on
the college level:

How many semesters of art instruction did you have in high school?

Have you had any private art instruction? (check one) Yes___ No___
If yes (above), explain:

Listed here are the three action themes used in this experiment:
   “Activity in the Lion Cage.”
   “Fire Engines Going to a Fire.”
   “Active Children in a Country School.”

1. Check the theme on which you think you made the best drawing
   (above). Why?

2. Have you ever visited the lion house or zoo? Yes___ No___
   Approximately how many times since you started high school?

3. Have you ever visited a fire station? Yes___ No___
   Approximately how many times since you started high school?

4. Have you ever attended a country school? Yes___ No___
   If yes, for how many years?

Listed here are the three groups used in the experiment:
   a. “Control Group” (Drew on the theme only)
   b. “Experimental Group 1” (Drew after sound color film)
   c. “Experimental Group 2” (Drew while on a field trip)

5. Check the group in which you did your drawing (above).

6. What do you think of 45 minutes as a time limit?

7. Which group would you prefer if the study were repeated?
   a.___ b.___ c.___ Why? (above)

8. As a group, which group will do best in final evaluation?
   a.___ b.___ c.___

9. How would you describe your particular ‘style’ or manner of
   drawing?


JOURNAL OF EXPERIMENTAL EDUCATION (Vol. 22


having .01 higher reliability coefficient than Y group, and .15
higher than X group. While the difference between Y and Z is
negligible, the indication toward instability by X group may be
because the positive qualities such as originality, creativeness,
spontaneity, and freedom of expression believed by many to accompany
the type of art experience to which group X had exposure, appear to
‘scatter’ that group in a situation where


X, and the greatest gain was made by group Z over group Y, .91. Some
drawings have qualities that ‘wear well’ while others may have
appeal that is fleeting. Perhaps the art experience has a
contribution to make in determining the ‘lasting’ qualities of a
drawing created in that environment.


with groups Y and Z about the same,
Subject matter of theme 3,
have quite a definite advan


years; and group X seem,


5 people who have had private art instru


; group Y had almost two times as much, and group X almost three
times as much as that had by group Z.

In summary, analyses of the findings show that there is a
substantial difference in performance by groups. Where groups are
compared, Z group is superior in every instance to X and Y. Where X
and Y are compared, X is superior in every instance. Differences in
performance are statistically significant in some instances at the
.01 level, in some at the .05 level, and in some instances are not
large enough to be statistically significant.

In the process of evaluation, the findings show that the agreement
of judges, as shown by coefficients of correlation, ranges from .11
to .66, but that some judges agree to a greater extent than others.
As shown by two judgments of the same material, with a two month
time lapse, coefficients of correlation range from .37 to .80. This
would indicate that some judges are more reliable than others, and
also that judges agree ‘with themselves’ more than they agree with
each other.
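
As a rough illustration of the two kinds of agreement described above, the following Python sketch computes Pearson coefficients between pairs of judges (agreement with each other) and between one judge's first and second judging (agreement ‘with himself’). The ratings, the judge labels, and the function name `pearson` are hypothetical and are not data from this investigation.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


# Hypothetical ratings (scale 1-6) given by three judges to six drawings.
first_judging = {
    "judge_A": [5, 3, 4, 2, 6, 1],
    "judge_B": [4, 3, 5, 2, 5, 2],
    "judge_C": [3, 4, 2, 5, 4, 1],
}
# Hypothetical second judging of the same drawings after a time lapse.
second_judging = {
    "judge_A": [5, 3, 4, 3, 6, 1],
    "judge_B": [4, 2, 5, 2, 6, 2],
    "judge_C": [4, 4, 2, 4, 4, 2],
}

# Agreement between judges: one coefficient per pair of judges.
between = {
    (a, b): pearson(first_judging[a], first_judging[b])
    for a, b in combinations(first_judging, 2)
}

# Agreement of each judge with himself: first versus second judging.
within = {j: pearson(first_judging[j], second_judging[j]) for j in first_judging}

for pair, r in between.items():
    print(pair, round(r, 2))
for judge, r in within.items():
    print(judge, round(r, 2))
```

Comparing the range of the `between` values with the range of the `within` values gives the kind of contrast reported in the paragraph above.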

No attempt was made in this investigation to equate the three groups
on the basis of curricular or co-curricular training and experience,
or past experience with subject matter of the three action themes.
However, findings from the individual data sheet and questionnaire
show wide differences among individuals in the above areas.
Information supplied through this avenue is valuable in the
extension and application of the findings of this investigation.

Through an analysis of the drawings themselves, many of the things
brought out in the analysis of the findings will be graphically
illustrated. For that reason, a catalog of 36 selected drawings,
with the written discussion included, will be presented in
Section IV.


SECTION IV 


COMPARATIVE STUDY OF INDIVIDUAL 
DRAWINGS 


A. Discussion of Drawing Qualities 


FROM THE 180 drawings that were made on three action themes, 31 have
been chosen for ... more interesting, more creative, and to possess
more action, more unity and organization, more value and pattern,
and more use ... Of course, regardless of the ... describe a set of
drawings, not every drawing will be found to be equally strong in
every quality listed.

An examination of the numbers represented in the nine best drawings
will reveal that seven are products of the Z group, one of the X
group, and one of the Y group. It may be noted that student 482 of Z
group had drawings good enough to be rated in the top two out of 60
drawings on all three action themes. On themes 1, 2, and 3, student
482 received ratings of 5, 5, 5, 5, 6; 4, 6, 6, 4, 5; and 4, 5, 6,
5, 3, respectively. (A rating of 6 means one out of the two best in
60 drawings; 5 means within the next eight best of 60 drawings, ...

One important advantage which the field trip would appear to have is
illustrated by 482 on theme ... The student who made this drawing
positioned himself to the left rear of the large hook and ladder
truck in the fire station. With much of the sensory material there
for reference, he built an interesting composition on the assigned
action theme. Quite a contrast in draftsmanship is evident in
comparing 126 and 458 on the first action theme. However, where many
qualities go to make up a good drawing, 126 apparently is judged
superior on such qualities as sincerity of presentation, total
organization, or more adequate handling of the action theme. The
novel viewpoint in 458 apparently has much to do with the success of
the drawing, also, since originality and creativeness are important.
In 928 the subject matter includes much detailed drawing which
requires intense observation by the person on the field trip;
however, the detail does not become all important. In 745, in the
drawing of the two lions, such things as simplicity and directness
of expression apparently compensate, to some extent, for the lack of
a more detailed knowledge of the form, and a more sensitive line. In
..., the schoolroom scene, there is a good illustration of how
certain elements are stressed, and others played down, to achieve a
total satisfying effect. One device used to direct attention to the
form of the teacher, silhouetted in an action pose, is the
convergence of lines toward that point.

The nine drawings found to be poorest include six from group Y, two
from group X, and one from group Z. Seven of the nine drawings
judged to be poorest are shown here. If the negative terms from the
criteria of the five judges are used, the three poorest drawings on
each of the three themes may be described as poor in ...
organization, with ideas often obviously from second-hand sources,
little evidence of individual inventive capacity, insincere in
presentation, including irrelevant and meaningless material, ... in
balance, and lacking the variation in ... texture to make an
interesting drawing. If we may describe the nine poorest drawings in
terms used by the students in the individual data sheet, we might
say that these drawings were lacking in information, showed little
imagination or originality, lacking in interest, uncertain and
sketchy, and not enough life.

From an examination of the numbers in this group, it will be noted
that 736 appears in the poorest two on all of the three themes where
60 drawings were evaluated. On themes 1, 2, and 3 in order, 736
rated as follows: 3, 1, 3, ..., 3; 1, 1, 1, 3, 3; and 2, 1, 1, 2, 3.

In 217 on theme 1, a novel idea had little effect, because of
apparent weaknesses such as poor organization, scattered elements,
and lack of emphasis. In 439 one defect of the control group may be
illustrated where space is filled with broad lines that add little
to the interest, and in the ‘almost human’ face which may have been
because certain information was lacking. In 726 a serious defect of
the Z group appears to be the unthinking attempt to render sensory
information with no attempt to organize or unify the elements into
good composition. The poor showing made by two drawings on the
second theme, numbers 138 and 783 shown later, indicates that
perhaps some allowance should be made in setting up the framework
for evaluation so that a ‘too obvious’ sense of humor would not be
penalized when considered in with 59 other drawings.

Some advantages and disadvantages of the X, 
Y, and Z groups have been indicated as possibil- 
ities. It was thought desirable to include a sec- 
tion where the most obvious advantage and dis- 
advantage of each group could be most clearly 
illustrated. 

The advantages of originality, freedom of expression, unique
imaginative quality, and creativity are claimed for the unhampered
art experience. It may be, as in 485 on theme 3, that when a person
who has ease and facility of expression, along with a large stock of
memory images, is allowed to concentrate on the drawing in a
situation where no great amount of sensory data is present, a well
organized drawing is more likely to be forthcoming. On the other
hand, lack of needed information can often prevent the elements of a
drawing from fulfilling their function, as in 875 on theme 3. Poor
use of space is evident in 875, possibly because the needed
information was not at hand. In 695 on theme 2, ideas which
apparently had their origin in the movie which was seen just before
the drawing was made, are worked into the drawing in such a way that
it becomes a unified whole. In 624 on theme 2, material which also
seemed to come from the movie still appears to be extraneous, and
the design disorganized. Much research needs to be done to
demonstrate the impact of moving pictures on our experience, and
just how permanent this impact is, in the way we assimilate the
information. In 459 on theme 3, there are many elements in the
drawing which appear to be well drawn and generally correct. The
clock looks like a clock, the globe like a globe, etc. Even though
there are such defects as perspective, and proportion difficulties,
the elements are well placed and appear to function. In 745 on theme
3, an apparent defect is excessive concern with putting things down
just as they are, with little thought going into the process of
making an interesting drawing. Stiff, formal drawings often result
from just putting down items at random in a drawing.

There are many qualities of drawings that influence any attempt to
evaluate drawings. Some qualities are more or less characteristic of
the stage of development at which these students are now. Drawings
126 and 413 were selected as illustrative of the naive, frank,
primitive concept of subject matter. Much of the value of this
quality comes because the drawing has unaffected and childlike
treatment of the subject matter in a way that ignores rules and
principles of drawing, and includes only that essential to the
composition. These two drawings were selected from the three sets of
60 drawings on the different themes, and they both happen to have
been made by X group; however, qualities such as these mentioned
above are often claimed for the art experience which has unhampered
expression. Number 413 has interesting placement of objects, and
space arrangement; however, the thing of special interest here is
the drawing of various parts of the picture as though each were a
unique, separate picture, viewed from a different angle. Intensity
of expression sometimes results in crude work, although the process
of weeding out the irrelevant results in a simplicity that is an
asset to the drawing, and not a detriment.

Although drawings with perspective problems would be many in a group
of students in a basic drawing course, drawing number 138 on the
first theme is an example of perspective difficulties that
complicate drawing at this stage. Any person who has tried to draw
several buildings at different levels in one drawing, as in 138, can
appreciate the complexity of the problem. The picture plane or
ground plane of an object has to be visualized in order to make a
convincing drawing. Since drawings such as these are made on flat
surfaces, there are only two dimensions, and yet objects which the
observer knows to be solid must be made to look solid.

As has been suggested, several drawings included a cartoon quality,
which had varying degrees of success. Where humor is employed, a
person has to handle it with care, or the attempt may be too
obvious. In the cartoon the subject is often dealt with by
caricature or by allegorical symbols. The difficulty a person often
runs into, in using cartoon technique, as in drawings 738 and 783 on
theme two, is that a person often accepts hackneyed or stereotyped
devices worked out by others, and reaches a ‘plane’ of development
at which he is arrested. In the use of symbols or cliches there is
too great an emphasis upon standard forms.

The story telling quality is present to some extent in every
drawing. However, group experience similar to that undergone by
group X in this investigation seems to provoke it more than other
types of experience where there is a greater amount of sensory
material in the immediate environment at the time of the creation of
the drawing. In 204 the title was thought to be necessary, and
perhaps it is needed. However, the inexperienced artist is often
tempted to put labels on his drawings, to express more completely
what he feels incapable of saying through the medium of drawing.
‘Activity’ is certainly the keynote of the drawing 217, theme 2.
This drawing was one where judges tended to differ widely, while
others of the more representational character were more stable.

The realistic, or illustrational quality, in which the observer
feels the actuality of the situation pictured in the drawing was
used in many drawings. Numbers 756 and 615 both seem to achieve a
truthful presentation, yet while the drawing is very fine in 615,
the total organization is ignored, while in 756, that is considered.
Details are made plain in both of these drawings in a practical way
by emphasis and subordination at appropriate places. This
representational quality is more than mere photographic likeness; a
camera is too literal; here there must be a process of assimilation
of detail, reorganization and interpretation, and presentation of
new and more vivid relationships.

In summary, because the symbols of expression used in graphic
illustration are quite different than those commonly used in verbal
expression, the inclusion of drawings and the written analysis is
felt to be essential in order that the reader may get a well rounded
view of important qualities evident only when the drawings
themselves may be seen. While we may get agreement on some things
thought to be essential to a good drawing, terms may often be
misleading, or have different meanings to different people. Where
verbal descriptions are used, descriptions must be tightly drawn to
reduce the danger of double meaning or ambiguity.

Since this investigation does identify a ‘best’ group of drawings,
and a ‘poorest’ group, it would be an interesting extension of this
study to submit all the drawings in these two groups to various art
juries, as well as to lay juries for evaluation and classification
into the two groups. High ranking drawings do not possess qualities
thought to be necessary for a ‘good’ drawing, in equal amount.
Apparently, special strength in a certain positive quality
compensates to some extent, for lack of strength in another positive
quality.


Often an artist will describe his style of drawing as having a
cartoon quality, or an illustrational quality, or some other
descriptive term that, to him, sums up the impression he believes is
created by his drawing. Sometimes the artist is strongly aware of
his particular style of drawing, but sometimes he is not, or perhaps
finds it difficult to describe verbally what he can express
graphically.

People differ in the extent to which they make use of sensory data,
are able to assimilate it, and are able to select and use it wisely
in a drawing. Because experience is such a unique and individual
thing, people must be allowed to create in the way that is best for
them.

B. Catalog of Selected Drawings 


(See following pages.) 


[Reproductions of the selected drawings appear here in the original. Legible labels: “LION CUB”; drawing 695.]




SECTION V 


SUMMARY OF THE INVESTIGATION 


A. Summary and Conclusions 


THE EXPERIMENTAL data growing out of this investigation, and the
conclusions resulting therefrom, will be summarized in four general
areas as follows:


1. Group performance for the three different art experiences
2. The reliability of individual performance in various groups
3. The evaluation of drawings
4. The contribution of student opinion


1. Group Performance for the Three Groups. —The study of the
relative effects of the three different art experiences, as
reflected in the drawings produced by participants, has revealed
that some art experiences are more valuable than others for the
production of drawings with merit. In one of the comparisons made
between groups with the analysis of variance technique the
difference was large enough to be statistically significant at the
.05 level of probability. The extent and direction of this
difference was discovered by working out the critical ratios between
the three groups. In one instance the Z group had an advantage over
the Y group with a critical ratio significant at the .01 level. In
another instance the X group had an advantage over the Y group with
a critical ratio significant at the .05 level. The advantage that Z
group had over the X group was not large enough to be statistically
significant at either the .05 or the .01 level. These findings were
amplified when 12 “Student's t” scores comparing group performance
on various action themes gave an advantage to Z group in every
instance where it was compared to X group, and gave an advantage to
X group wherever it was compared to Y group; and in some of these
instances by an amount large enough to be statistically significant.

The statistical findings favor the Z group in every instance where
it is compared with the other groups. In every instance where X and
Y are compared, the X group is superior.
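
The critical ratio mentioned above can be sketched in Python as the difference between two group means divided by the standard error of that difference. This is a minimal illustration under stated assumptions: the scores are hypothetical, the function name `critical_ratio` is invented here, and the unpooled standard error is one common form of the statistic, not necessarily the exact computation used in this investigation.

```python
from math import sqrt
from statistics import mean, variance


def critical_ratio(group_a, group_b):
    """Difference between two group means divided by its standard error.

    Uses the sample variance of each group (an assumption; the source
    does not state which form of the standard error was used).
    """
    se_diff = sqrt(
        variance(group_a) / len(group_a) + variance(group_b) / len(group_b)
    )
    return (mean(group_a) - mean(group_b)) / se_diff


# Hypothetical mean ratings of drawings for two groups of students.
z_group = [4.2, 3.8, 4.6, 3.4, 4.0, 4.4]
y_group = [3.0, 3.6, 2.8, 3.2, 3.4, 2.6]

cr = critical_ratio(z_group, y_group)
print(round(cr, 2))
```

The sign of the ratio shows which group has the advantage, and its size is compared with the tabled value for the chosen level of significance and the number of cases.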
2. The Reliability of Individual Performance in Various Groups. —An
idea of the reliability of the performance by an individual of one
group compared with that of an individual of another may be had by
an examination of individual scores. Individuals whose drawings
fluctuate the most in the ratings given by the five judges were
chosen for study. Without special attention to the level of the
ratings, if the amount is totaled that an individual varies from the
mean of his five ratings, a general idea may be had of where the
greatest variation exists. Following this plan, an examination will
reveal individuals in the Z group as the most stable, those in the Y
group the next most stable, and those in the X group the least
stable.
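
The totaling described above can be sketched in Python. The ratings are those reported for student 482 in Section IV; the function name `total_deviation` is invented here, and taking deviations as absolute values is an assumption, since the text says only that the amount of variation from the mean is totaled.

```python
def total_deviation(ratings):
    """Sum of absolute deviations of the ratings from their own mean."""
    mean_rating = sum(ratings) / len(ratings)
    return sum(abs(r - mean_rating) for r in ratings)


# Ratings by the five judges for student 482 on themes 1, 2, and 3,
# as reported in Section IV.
ratings_482 = {
    1: [5, 5, 5, 5, 6],
    2: [4, 6, 6, 4, 5],
    3: [4, 5, 6, 5, 3],
}

for theme, ratings in ratings_482.items():
    print(theme, round(total_deviation(ratings), 2))
```

A smaller total means the five judges were closer together on that drawing, regardless of whether the ratings themselves were high or low.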

Where coefficients of correlation were computed on two evaluations
of a single action theme of 60 drawings, the Z group had the highest
with .91, the Y group next with .90 and the X group somewhat lower
with .76. The X group thus appears to be much less stable than
groups Y and Z.

3. The Evaluation of Drawings. —Strict con- 
formity of agreement is neither necessary nor 
desirable in the evaluation of drawings, espec- 
ially since each judge relied upon his own per- 
sonal criteria for the evaluation. In instances 
where judges agree to a large extent on the draw- 
ings of a particular action theme, there is no 
reason to believe that the same judges will agree 
to the same extent upon the drawings of a differ- 
ent action theme. Judges agree more with them- 
selves than with each other, as might be expected. 
Judges have the least agreement on the subject matter of the
‘animal’ theme, agree to a greater extent on the ‘machine’ theme,
and agree to the greatest extent on the ‘people’ theme.

A method of increasing the reliability of eval- 
uation is possible where the judges are asked to 
make two evaluations of the same material, with 
allowance of a time lapse between evaluations. 
Correlations between two such evaluations of one 
judge will reveal which judges are the most reli- 
able. Judges can then be divided into two groups: 
the ‘most’ reliable, and the ‘least’ reliable. The 
coefficient of correlation calculated between the 
two groups will reveal the extent of agreement, 
and where the correlation is too low, it may be 
desirable to use only evaluations of the ‘most’ 
reliable group. 
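
The procedure just described can be sketched in Python as follows. All ratings are hypothetical. Splitting the judges into two equal halves, and comparing the groups by the average rating each gives to a drawing, are assumptions made for illustration; the text does not say how the division or the comparison was carried out.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return cov / sqrt(
        sum((a - mean_x) ** 2 for a in x) * sum((b - mean_y) ** 2 for b in y)
    )


# Hypothetical first and second evaluations of eight drawings by four judges.
first = {
    "judge_A": [5, 3, 4, 2, 6, 1, 4, 3],
    "judge_B": [4, 3, 5, 2, 5, 2, 3, 4],
    "judge_C": [3, 4, 2, 5, 4, 1, 5, 2],
    "judge_D": [2, 5, 3, 4, 6, 1, 3, 4],
}
second = {
    "judge_A": [5, 3, 4, 2, 6, 1, 4, 2],
    "judge_B": [4, 3, 4, 2, 5, 2, 3, 4],
    "judge_C": [4, 2, 3, 3, 5, 2, 3, 4],
    "judge_D": [3, 3, 2, 5, 4, 2, 5, 1],
}

# Step 1: correlate each judge's two evaluations of the same material.
retest = {j: pearson(first[j], second[j]) for j in first}

# Step 2: divide the judges into the 'most' and 'least' reliable halves.
ranked = sorted(retest, key=retest.get, reverse=True)
half = len(ranked) // 2
most, least = ranked[:half], ranked[half:]


def group_means(judges):
    """Average first-evaluation rating per drawing within a group of judges."""
    n_drawings = len(first[judges[0]])
    return [sum(first[j][i] for j in judges) / len(judges) for i in range(n_drawings)]


# Step 3: correlate the two groups to see how far they agree.
agreement = pearson(group_means(most), group_means(least))
print(most, least, round(agreement, 2))
```

If `agreement` is judged too low, only the evaluations of the judges listed in `most` would be retained.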

4. The Contribution of Student Opinion. —A summary of student
opinion acquired in connection with an investigation can be valuable
in supplying information to supplement findings of a statistical
nature, and can help to give an all-round description of what the
students are thinking, their interests, problem areas, and other
things that aid in development of art experiences that are vital and
interesting. Student opinion was sought concerning their own drawing
in these areas: On which action theme do you think you made the best
drawing? Is 45 minutes adequate time in which to make a satisfactory
drawing? How would you describe your own particular ‘style’ or
manner of drawing?

Most people thought that they made their best drawing on theme 3,
the ‘people’ theme. The 45 minutes was considered by most people to
be adequate. Some found it difficult to describe their own style,
but the remarks in response indicated that most people were aware of
their style being in a ‘stage’ of developing.


B. Problems Suggested by This Investigation


Since the symbols of expression used in art are often very different
than those used in speech, writing, and the use of numbers, the
research in Art Education must surmount the fact that the non-verbal
nature of expression increases the difficulties encountered in
description and measurement. Traditional practices in art are in
part responsible for some of the general problems faced by the
investigator; examples of these are the tendency to set up a
dichotomy of those who are ‘good’ in art and those who are not, the
tendency to judge art from an adult standard, too much selection of
students in the past, without adequate provision for the needs of
all the students, and many others. Since the problems appear most
naturally in question form, they are stated here in that form:


1. How can we relate art experiences more directly to the needs of
   everyday life?

2. Following the identification of needs of the students, how can we
   be sure that these are ‘felt needs’?

3. How can we develop in the student capacity and desire for more
   continuous self-education to the end that each experience is
   meaningful and worthwhile?

4. To what extent can we plan art experiences around the current
   interests of the students?

5. In a situation where art experiences may be compared, what art
   experiences are of the most value for specific groups and for the
   individuals that make up these groups?


C. Implications of This Investigation for Education


To achieve the full potential of which we, as human beings, are
capable, and to fulfill our duty and obligation as citizens and
leaders in a democratic society, we must know more about the
relation of Art to human behavior in actual learning situations. The
following appear to be important implications and ways of
application of these findings, whether for the construction of a
general art course, or for education in general:


1. Exposure to a variety of problem situations involving many and
varied materials and methods of expression in the field of Art,
allows the student to become acquainted with basic facts and
principles important in the creative process. Through these problem
situations the student is led to develop a personal system of values
that constitute his personal art judgment, and help him to become an
intelligent consumer, critic and creator of art products.


2. Art may be fostered under any number of stimulating conditions,
and it is impossible to say that procedure A is better than
procedure B for all situations, or for all people. However, in any
situation, the teacher should be free to discuss issues with the
students, make suggestions, point out alternate courses of action,
guide and help each individual student in every way possible.

3. Because of the diversity and individuality 
of experience, there is no one appealing type of 
subject matter. Interests are as varied and nu- 
merous as the number of students in the class, 
so that offerings must be individualized to give 
purpose, stimulate effort, and give meaning to 
the content. 

4. Various attempts must be made to adjust the learning experience
to the way individual students learn. For example, some learn best
through lecture and demonstration methods where the emphasis is on
verbal learning; others, handicapped there, learn best in a
laboratory situation where the actual manipulation allows more motor
coordination in the learning process; while others may have to
depend more upon a spatial expression in three dimensional media.


5. Development must be expected and encour- 
aged on all levels, since experience is a contin- 
uous, cumulative thing. If we recognize this 
fact, at no point in the development process will 
the child be judged by adult standards. 

6. If we tie our art experiences, as we must, to contemporary life,
it is necessary that we go beyond the narrow limits of the
classroom, to make curricular and co-curricular experience more
meaningful and more functional for all.


A STUDY OF FOURTH GRADE CHILDREN'S 
COMPREHENSION OF CERTAIN VERBAL 
ABSTRACTIONS 


MARY C. SERRA* 
Temple University 
Philadelphia, Pennsylvania 


THE PROBLEM 


Statement of the Problem 


THE CHIEF purpose of this study was to investigate the ability of
fourth-grade pupils to comprehend different verbal abstractions as
... in vocabulary common to primary grades. Specific data were
obtained on the following questions:


1. What relationship exists between comprehension of verbal
   abstractions and background of information?

   a. between ability to classify ideas and background of
      information?
   b. between ability to index ideas and background of information?

2. What relationship exists between comprehension of verbal
   abstractions and intelligence?

   a. between classifying ability and intelligence?
   b. between indexing ability and intelligence?

3. What relationship exists between classifying and indexing
   ability?


Justification of the Study


The justification for this study may be established as follows:


1. To investigate the comprehension of certain verbal abstractions
   by employing them in tests designed to measure ability to
   classify and index.

2. To determine to what extent the ability to classify and index
   ideas varies with intelligence.


Limitations of this Study 


This study is concerned with the ability of a group of fourth-grade
subjects to classify and index. To determine the extent to which
subjects of this age group possess this ability, an Inventory on
Background of Information, a test on Classifying Ideas, and a test
on Indexing Ideas were administered. The verbal intelligence of the
population of this study was determined by means of the
Stanford-Binet Test of Intelligence, Form L, 1937 Revision. 1

The Population. —Three populations were used in this study:


1. Population used for the experimental inventory on background of
   information. One hundred subjects were used; twenty-five cases in
   each of four grade levels.

   a. Sex. Both sexes were used in this study.
   b. Age. The chronological age range was from 6 to 10 years,
      inclusive.
   c. School Experiences. The cases selected for the study were
      obtained from grades one to four, inclusive.
   d. Geographical Area. The subjects were selected from Garrettford
      School in Drexel Hill, Pennsylvania.


2. Population used for the experimental tests on classifying and
   indexing ideas. Three hundred subjects were used; one hundred
   cases in each of three grade levels.

   a. Sex. Both sexes were used.
   b. Age. The chronological age range was from 8 to 11 years,
      inclusive.
   c. School Experiences. The subjects selected for the study were
      obtained from grades three to four, inclusive.
   d. Geographical Area. The subjects were selected from Garrettford
      School in Drexel Hill, Pennsylvania.


* Present address: Illinois State Normal University, Normal,
  Illinois.
1. Published by Houghton Mifflin Co., Boston, Massachusetts.


3. Population used for the final survey. One 
hundred fourth grade subjects were used. 


a. Sex. Both sexes were used. 

b. Age. The chronological age range was 
from 9 to 10 years, inclusive. 

c. Intelligence. The intelligence quotient
range was from 72 to 151.

d. Geographical Area. The subjects for the 
final survey were from nine fourth grades 
in Ridley Township School District, Penn- 
sylvania. 


Abstractions Used in this Study. —Three sets of primary basic
readers and a set of science books were analyzed for 246 verbal
abstractions: (1) Easy Growth in Reading Series, 2 (2) Curriculum
Foundation Series, 3 (3) Betts Basic Readers, 4 and (4) The
Wonderworld of Science.

Tests Used in this Study. —One standardized 
test and three instruments that were construct- 
ed for this study were used. The Stanford-Binet 
Test of Intelligence, Form L, 1937 Revision, 
was the standardized test used to determine the 
verbal intelligence of the population. The other 
three instruments were: (1) an inventory on 
background of information, (2) a test on class- 
ifying ideas, and (3) a test on indexing ideas. 

Statistical Procedures. — The subjects were 
given a series of tests to measure various abil- 
ities. To determine whether the instruments 
used to measure certain abilities were signifi- 
cant, two statistical techniques were employed: 


1. The standard error of the difference between
   proportions was used to determine the
   significance of items.

2. The reliability of the tests was established
   by using the split-half method and the
   Spearman-Brown “Prophecy Formula.”
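The first technique can be illustrated with a short Python sketch. The article does not print its formula, so the pooled form of the standard error used here is an assumption, and the function names and the example counts are hypothetical. The cut-off values 1.96 and 2.58 are the ones that appear in the item-analysis tables.

```python
from math import sqrt


def z_for_difference(correct_a, n_a, correct_b, n_b):
    """z-score for the difference between two proportions.

    Assumption: the standard error is computed from the pooled proportion.
    """
    p_a = correct_a / n_a
    p_b = correct_b / n_b
    pooled = (correct_a + correct_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:  # every subject passed, or every subject failed
        return 0.0
    return (p_b - p_a) / se


def significance(z):
    """Label a z-score with the two levels used in the item-analysis tables."""
    size = abs(z)
    if size >= 2.58:
        return "significant at the 1 percent level"
    if size >= 1.96:
        return "significant at the 5 percent level"
    return "not significant"


# Hypothetical item: 40 of 100 third graders and 62 of 100 fourth graders correct.
z_example = z_for_difference(40, 100, 62, 100)
print(round(z_example, 2), significance(z_example))
```

An item whose z-score clears the chosen cut-off is treated as discriminating between the two groups being compared.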


The results obtained from the tests were
compared by employing the Pearson Product
Moment Formula. This technique was used because
it provided a way of determining the
relationship existing among various abilities.
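A minimal sketch of the product moment coefficient follows, written in plain Python so that each term of the formula is visible. The function name and the score lists are hypothetical and are not data from the study.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product moment coefficient of correlation for paired scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long lists with at least two pairs")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and of squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant list has no defined correlation")
    return sxy / sqrt(sxx * syy)


# Hypothetical paired scores for five pupils on two instruments.
classifying_scores = [1, 2, 3, 4, 5]
indexing_scores = [2, 4, 5, 4, 5]
print(round(pearson_r(classifying_scores, indexing_scores), 3))
```

A coefficient near +1 indicates that pupils who score high on one instrument tend to score high on the other.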


Terminology 


1. “Verbal Abstraction” is used synonymously
   with verbal concept. For example, the written
   symbol dog, representing a special animal,
   is a verbal abstraction or concept. Verbal
   concepts go up and down the ladder of abstraction.
   In order of increasing abstraction, more
   characteristics are omitted:

   a. The descriptive or naming level is the lowest
      verbal level of abstraction. For example,
      Skippy, the name of a dog, is an abstraction
      at the lowest verbal level; it is still an
      abstraction since some characteristics of
      the real Skippy have been omitted.

   b. The abstraction dog is at a higher level than
      Skippy since more characteristics have been
      omitted. Dog may not mean a special dog,
      but any dog at all.

   c. The verbal abstraction animal is at a higher
      level than Skippy or dog. Still more characteristics
      have been omitted in the abstraction
      animal than in dog. Animal is a more
      general term than dog or Skippy.

2. “Category” is the term used to indicate a broad
   division or group in which persons, places, objects,
   or events having common characteristics
   can be placed. For example, animal is the
   group name or broad division to which dogs,
   cats, horses, etc., belong.

3. “Classifying” is the ability to put persons,
   places, objects, or events with similar characteristics
   or logical relationships in commonly
   accepted groups or categories. For example,
   man, dog, and horse have similar characteristics
   and can be placed in the animal category.

4. “Concept” is defined as a “meaning sufficiently
   individualized to be directly grasped and
   readily used, and thus fixed by a word.”⁶

5. “Indexing” is the ability to name specific persons,
   places, objects, or events that have been
   placed in a special category by society. In the
   animal category, the naming of a special type
   of animal, such as rat, would be indexing.


2. Gertrude Hildreth. Easy Growth in Reading Series (Philadelphia: John C. Winston Co., 1940).

3. William S. Gray and May H. Arbuthnot. Curriculum Foundation Series (New York: Scott, Foresman Co., 1941).

4. Emmett A. Betts and Carolyn M. Welch. Betts Basic Readers (New York: American Book Co., 1948).

5. Warren Knox, George Stone, and Doris Noble. The Wonderworld of Science (New York: C. Scribner's Sons, 1946).

6. John Dewey. How We Think (Boston: Heath and Co., 1910), p. 125.
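The ladder of abstraction and the two abilities defined above can be pictured with a small Python sketch. The mapping and the function names are hypothetical illustrations built from the article's own examples (Skippy, dog, animal, rat); they are not an instrument used in the study.

```python
# Each term points to the next, more general term on the ladder of abstraction.
MORE_GENERAL = {
    "Skippy": "dog",
    "dog": "animal",
    "cat": "animal",
    "horse": "animal",
    "man": "animal",
    "rat": "animal",
}


def ladder(term):
    """List the terms from `term` upward in order of increasing abstraction."""
    steps = [term]
    while steps[-1] in MORE_GENERAL:
        steps.append(MORE_GENERAL[steps[-1]])
    return steps


def classify(term):
    """Classifying: name the category in which a specific term is placed."""
    return MORE_GENERAL.get(term)


def index(category):
    """Indexing: name the specific terms that have been placed in a category."""
    return sorted(t for t, parent in MORE_GENERAL.items() if parent == category)


print(ladder("Skippy"))
print(classify("horse"))
print(index("animal"))
```

Classifying moves up the ladder from the specific term to its group name, while indexing moves down from the group name to its members.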


A REVIEW OF KINDRED LITERATURE


There are a large number of studies reported
in the literature which bear upon different aspects
of the basic problem of verbal abstractions or
concepts, as they are represented by words. Although
no research was found dealing directly
with classifying and indexing as defined in this
study, the literature reported in this
section will provide a background for an understanding
of the problem. The studies reviewed
are referred to under the following three aspects
of the problem: (1) how to develop concepts and
their verbal representations, (2) measurement
of concepts, and (3) concept burden of instructional
materials.


How to Develop Concepts and Their Verbal
Representations


Numerous experimenters investigating the
problem were interested in determining
best teaching procedures rather than establishing
principles of how to develop concepts. Inferences
drawn from research into the way concepts
develop and from the factors that influence their
development provide the needed principles.
reviews Experience. — Findings from research 
ment me On factors influencing concept develop" 
on whic icate that the more direct the experience 
the hy e concept is built, the greater will be 
that vidual’s knowledge and understanding of 
Concept, 
build Tect experience is necessary in order to 
any or oncepts, It is impossible, however, for 
ne to develop all the concepts needed in 


moq ς e 

ilone life on the basis of direct experience 

* Vicarious experience must be utilized 
ough the med- 


and 
iun, MUCH of it will be received thr 
mbols are ad- 


itis 


ui Sis of direct experience. 
κα concepts, then, tm necessary t 
Cepte th ED rder to establish the simple ο 
Sequenti t will be combined and manipulated Su 
Once y to form the more complex concepts. 
la n that can be traced back only to verbal 
ult da or symbols acquired through language 
Mar n mere verbalism. 
lished ρω ct Bedwell’s (1) investi 
T fact, act that it is possible to have Vero", 
Conce "us knowledge without having functiona 
. Practical experience, as She points 
the most important factor in concept de~ 
Sister M. B. Herbers (15) inves~ 
zd the comprehension difficulties of 30 
$rad? pupils. This study indicated that 


ry to provide 
mple con^ 


Tes 
gation estab- 


SERRA 


105 


concrete materials and personal experiences are 
necessary to overcome verbalism. Osburn, Hunt- 
ington, and Weeks (27) experimented with vicar- 
ious experience between the levels of direct ex- 
perience and of verbalism. Their findings were 
that experiences with pictures, illustrations, mod- 
els, and the like should be used to make concrete 
the relationships implied in language. 

H. J. Sachs (30) makes the point that children 
do not acquire concepts by merely meeting words 
in context. He worked with 416 freshmen to de- 
termine the extent of their meaning vocabulary. 
His study shows the limitations of the reading 
method of improving vocabulary. The work of 
Ruth L. Sims (34) emphasizes that the role of ex- 
perience is of utmost importance in the develop- 
ment of concepts. Helen B. Stolte (38), working 
with geographical concepts, draws the same conclusions.
M. Theresa Wiederfeld (43), in the field
of history, concludes that building concepts necessitates
providing experiences.

Vocabulary Development. —Much research de- 
voted to determining the most effective means of 
increasing vocabulary assumes that enlargement 
of vocabulary is a virtue in itself without question- 
ing the dimensions of the concepts with which 
words are associated. In that direction lies verbalism.
In the mass, however, this research demonstrates
that vocabulary is increased by experiences
of two kinds:


1. Experience with the raw materials of the
   concepts for which given words are symbols;
   that is, experiences with objects and
   processes and with lower level concepts on
   which the required concepts are built.

2. Experience with the given word itself; that
   is, hearing the word, speaking the word,
   and reading and writing it.


Experience with the raw materials of concepts
develops the concept; experience with the word
associates word and concept.

Dunkel’s (9) conclusion that the ability to deter- 
mine the precise meaning of a word is related to 
the ability to read with comprehension gives the 
clue to the reason for the great quantity of inves- 
tigations of word meaning and vocabulary develop- 
ment. He constructed a new type of vocabulary 
test. He was dissatisfied with the conventional 
type of test which assumes a “core-meaning”
or “sphere of meaning” for each word. These
tests, he felt, did nothing to test the student's
interpretation of the precise meaning taken on by a
word used in a particular situation. In Dunkel's
test, words were used in a paragraph and then in
five sentences. Subjects were to mark the item
which had the same meaning as in the paragraph.
This test was administered to subjects in the
tenth, thirteenth, and fourteenth grades. He concluded
that:


1. The ability to determine the precise meaning
   of a word is related to the ability to
   read with comprehension.

2. Education and maturity lead to development
   of the ability to determine the precise
   meaning of a word.


Ralph Haefner (13), Harry K. Newburn (25), 
Glenda Liddell (18), H. J. Sachs (30), William 
S. Gray and Eleanor Holmes (11) are a few of 
the investigators concluding that it seemed possible
to improve the vocabulary of a group of
children and adults by merely exposing them for
a few minutes each day to a new word. Vocabulary
is acquired by casual learning and by form- [...]

[...] says, “Words do not mean what they say.” But
through semantics, another factor influencing
concept development is recognized, i.e., the
multiple meaning of words.

The real significance of multiple meaning
arises from the fact that we receive so many of
our concepts through stimuli in the form of words,
spoken or written. No one can have all the experiences
directly that are basic to understanding
modern life. Vicarious experience must be
resorted to, and in practice much of it turns out
to be in the form of spoken or written language.
For the reception of language, there must be
agreement upon the meanings of words. When [...]
word and the concept he forms may be totally at
fault.

One means of developing concepts is through 
extending vocabularies. The investigations of 
Blair (2), Curoe (7,8), Gray and Holmes (11), 
Haefner (13), Liddell (18), Newburn (25), Sachs 
(30), Sanderson (31), Shannon and Kittle (33), 
Tate (39), Traxler (40), Waring (41), and Waters
(42), all point out that direct study of words and
their meanings is productive in extending vocabulary.


The Measurement of Concepts 


In nearly all research in the field of concepts,
investigators have experienced difficulty in their
efforts to measure concepts. In practically all
the studies in this area, measurement of some
type was required. Attempts have been made to
evaluate concepts by a number of devices.

Several types of testing can be adopted to measure
the existence of a given concept. L. J. Cronbach
(6), Ruth Looby (20), Hyman Meltzer (22),
and Helen Stolte (38) found that the oral interview
technique was a superior technique in measuring
concept development.

Attempts have been made to measure concepts
by causing the subjects to perform appropriate
acts as an index of their presence. Margaret
Bedwell (1), Cofer and Smith (5), Hazlitt (14),
Long and Welch (19), Mott (23), Sims (35), Rashkis,
Cushman, and Landis (28), and Chard and
Swartz (4) employed various performance techniques
to measure concepts in their research. Appropriate
acts to demonstrate the concept of an object,
relationship, or a principle is one type of
performance test. Sorting stimuli, words, phrases,
or objects to demonstrate the understanding of
certain objects is a second kind of performance
test. Neither type appeared to measure accurately
the acquisition of concepts.

Many experimenters have used multiple-choice
tests to measure knowledge of word meaning. V. H.
Kelley (17) concluded that the five types of objective
word meaning tests he used in his investigation
did not possess sufficient validity to make them
valuable as instruments for measuring the total word
meaning knowledge of children. Cronbach (6), Dunkel
(9), Eskridge (10), Serenius (32), and Stalnaker
(37) concluded that selected distractors should be
used in multiple choice tests.

Murphy (24) added free association tests to his
investigation. He worked with college freshmen.
He found association tests to be a device useful
in supplying an index to concepts. Buswell and
John (3) investigated the vocabulary of arithmetic.
They found that the ability to make an association
of an appropriate word with a concept cannot be
taken as evidence of the extent of a concept.

There is agreement among experimenters that
regardless of the [...] utilize a concept in one
situation is not an index that
it can be utilized in other situations.


Concept Burden of Instructional Material


[...] experiences and mere verbalism.
Some of the studies in this area have little or
no individual statistical significance and their
specific findings are open to question. The agreement
among them, however, cannot be ignored.


Social Studies 


Bedwell (1) studied children's comprehension 
of concepts of quantity in third grade social stud- 
ies reading materials. Both definite and indef- 
inite terms were found to be of differing degrees 
of difficulty. Children demonstrated that it was
possible to have a factual knowledge of a term
without having a functional concept of the same
term. Even in the restricted field of concepts
of quantity at the third reader level, the load
was too heavy for the children and verbalism resulted.
Definite and indefinite terms for quantity
were misinterpreted because children lacked
concepts based on experience. This makes it
clear that concept burden is relative, to be evaluated
only in terms of the established concepts
of intended readers.

Ritter (29) very carefully studied the words
and meanings (and by inference, the concepts)
used in two popular fourth grade geography textbooks.
Her findings are glaring when contrasted
to the usual vocabulary burden of basal reading
series. In the more recent basal reading
books for grade four, the authors, who can be
presumed to have given careful consideration to
children's capacity to acquire new vocabulary
effectively, introduced roughly 1000 new words.
Ritter found that 2195 technical, difficult, or unusual
terms are introduced in one fourth grade
geography. In the basal reading instruction
nothing is more important than the building of
reading vocabulary on the basis of well-established
concepts. In geography texts, vocabulary
is relatively incidental. One author of a geography
text doubles the burden the authors of
readers feel children can carry.

Springman (36) in a study of sixth grade pupils'
understanding of statements in social studies
textbooks found that only about half of the
children fully comprehended the statements. He
concluded that the concepts were much too heavy
a load for the children he tested and it can be
implied that it would constitute excessive burden
for many children.


Basal Reading Materials 


Marcum (21) made a direct attack upon the
concept burden of primary basal reading materials.
She analyzed the concepts expressed in fifteen
series of preprimers, primers, first, and second
readers, and found the following total number
of concepts at the indicated levels:


1. 110 different concepts in 15 preprimers

2. 406 different concepts in 15 primers

3. 719 different concepts in 15 first readers

4. 1487 different concepts in 15 second readers


At any one level, not more than 17 concepts were
common to all series; only one concept was common
to all preprimers.

Sims (34) concluded that the concepts which a 
child will need for handling materials adequately 
through reading can be determined by analysis. 
She recommended the compilation and use of “concept
vocabularies” or “concept lists.”

Ogle (26) found that the concept burden varied
from one series of basal readers to another at the
primary level. He classified the concepts and
found twice as many relating to house and home as
in any other classification, with concepts of metals
occurring least often.

As of 1938, Hockett (16) reported that the basic
vocabulary of “recent” primers and first readers
was not as extensive as in older books. There was
more repetition of words in “recent” readers than
in earlier editions. There is no evidence that this
trend has been reversed. Fewer words and more
experience with them is the major promise of the
authors of basal readers. The same approach
could well be assigned to the treatment of concepts.

Gunderson (12) found some hopeful signs in 1942.
She concluded that:


1. Basal readers for the first three grades
   make provision for the development of meaning
   vocabularies in verbs.

2. The large number of synonyms introduced at
   progressive levels enabled children to become
   aware of the common as well as the
   more specific meanings of words.

3. The vocabulary load increases from the preprimer
   to the third reader for growth in accuracy
   of comprehension and depth of interpretation.

4. More colorful and precise words are introduced
   in the second and third readers for
   finer interpretation, developing shades of
   feeling, and a sensitivity for meaning.


Gunderson's study dealt with words and their 
meanings rather than with concepts. She analyzed 
ten sets of readers. Her study shows the serious 
attention given by authors of basal readers to the 
nature of learning in children. 

The real harm in overloading instructional ma- 
terials with concepts is that overloading produces 
verbalism (or no learning at all). Looby (20) found
in investigating the understandings children in the
sixth grade derive from their reading of literature
that verbalism is rampant. Children's understanding
of “the stern of the ship” placed the stern
from one end of the boat to the other. “Breathing
space” meant a space in the mouth. A man
who was “slain” meant that he had servants.
Yet children read and used these words.

In books overloaded with difficult terms, a
large portion of these terms are not associated
in any usable manner with established concepts.
In some cases, children can ascribe no meaning
to the terms; in other cases they ascribe wrong
meanings, vague meanings, or partially correct
meanings. In any case, where correct meaning
is not ascribed, the best that can result from
reading the term is verbalism.


PROCEDURES 


This study was chiefly concerned with evaluating
certain abilities involved in comprehension
of concepts at different levels of complexity. To
measure relationships existing among these abilities,
two main procedures were followed: construction
of instruments, and a survey employing
them.

Before it was possible to conduct the survey,
three tests needed to be constructed: (1) background
of information inventory, (2) tests on
classifying ideas, and (3) tests on indexing ideas.
For the background of information inventory,
primary reading materials, currently used, [...]
and science con- [...] individually to a population
at four different grade levels: 25 subjects in each
of grades one, two, three, and four. The level of
difficulty of items was [...] (1) each item [...]
88 percent [...] grades.

A final inventory of 100 items was constructed
based on the results of the experimental inventory.
Reliability was obtained on the hundred [...]

Preliminary and final tests on classifying of
ideas were constructed. Concepts common to [...]


a. Two abstractions were used in the first
   part of the analogy. The first abstraction
   was a lower level abstraction than the second
   one; e.g., Sally is to girl. Sally is a
   lower level verbal abstraction than is girl.

b. One abstraction was used in the second part
   of the analogy. It was coordinate with the
   first abstraction used in the analogy; e.g.,
   as Bob is to. Bob is at the same level of
   abstraction as Sally. Bob and Sally are
   coordinate levels of verbal abstractions.

c. Three choices were given in each analogy.
   The choices were at three different levels
   of abstraction; e.g., brother, Billy, boy.
   1) Brother is a higher level verbal abstraction
      than is Billy but lower than boy.
   2) Boy is a higher level verbal abstraction
      than brother or Billy.
   3) Billy is the abstraction at the lowest
      level.

d. The choice representing the best answer
   was coordinate with the second abstraction
   in the first part of the analogy. Boy is
   coordinate with girl, the second verbal
   abstraction in the first part of the analogy.

e. The abstraction representing the best answer
   was alternated; it was not in the same
   position for more than two consecutive
   analogies.


The experimental classifying test of 56 analogies
was administered to a population at three
grade levels: one hundred subjects in each of
grades three, four, and five. The Z-Score of
each item in the tests taken by third and fourth
grade subjects was determined by the standard
error of the difference between proportions. The
Split-Half Method and Spearman-Brown “Prophecy-Formula”
were employed to estimate a coefficient
of reliability of .94, corrected .97.

The fifty items included in the final test on
classifying met the following criteria:


1. Combined data of each item between grades
   three and four giving a Z-Score of at least 3.13,
   significant at the one percent level. Z-Score
   was determined by the Standard Error of the
   Difference Between Proportions; in this case,
   top one hundred and bottom one hundred.

2. Each item on the test taken by fourth grade
   subjects having a Z-Score of at least .40.
   Fifty items selected in order to have a mean
   Z-Score of 2.00, significant at the 5 percent
   level. Z-Score was determined by the Standard
   Error of the Difference Between Proportions;
   in this case, using the top fifty and
   bottom fifty.

3. Minimum increase of eighteen percent correct
   responses for each item from the third
   to the fourth grade level.

4. Percent of correct responses for each item
   in the fifth grade tests the same or higher
   than in the fourth grade.


The items selected were arranged in order
of increasing difficulty as indicated by fourth
grade data.
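The step from the split-half coefficient to the corrected coefficient reported for the classifying test can be checked with the Spearman-Brown formula. The sketch below is illustrative only: the function name is hypothetical, and it assumes that the .94 is the correlation between the two half-tests and the .97 the estimate for the full-length test.

```python
def spearman_brown(half_test_r, length_factor=2.0):
    """Estimate the reliability of a test lengthened by `length_factor`.

    With length_factor=2 this is the usual correction applied to a
    split-half coefficient.
    """
    return length_factor * half_test_r / (1 + (length_factor - 1) * half_test_r)


# Split-half coefficient reported for the experimental classifying test.
corrected = spearman_brown(0.94)
print(round(corrected, 2))
```

Doubling the length of a test raises the estimated reliability, which is why the corrected figure is higher than the split-half figure.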

Experimental and final tests on indexing ideas
were constructed following the same procedures
as in the classifying tests. The following criteria
were followed in making the analogies for
the indexing test:


a. Two abstractions were used in the first part
   of the analogy. The first abstraction was at
   a higher level of abstraction than the second
   one; e.g., Bird is to robin. Bird is a higher
   level verbal abstraction than robin.

b. One abstraction was used in the second part
   of the analogy. It was coordinate with the
   first abstraction used in the analogy; e.g.,
   as insect is to. Insect and bird are coordinate
   levels of verbal abstractions.

c. Three choices were given in each analogy.
   The choices were at different levels of
   abstraction; e.g., grasshopper, animal, and
   life.
   1) Grasshopper is a lower level abstraction
      than is animal or life.
   2) Animal is a higher level abstraction than
      grasshopper but a lower level than life.
   3) Life is a higher level abstraction than
      grasshopper or animal.

d. The choice representing the best answer was
   coordinate with the second abstraction in the
   first part of the analogy; e.g., grasshopper
   is coordinate with robin. Grasshopper is the
   best answer.

e. The abstraction representing the best answer
   was alternated; it was not in the same position
   for more than two consecutive analogies.
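The analogy format used in both tests can be sketched in Python. The numeric levels below are an assumption made only for illustration (the article ranks the terms but assigns no numbers), and the function name is hypothetical. The rule applied is the one stated in the criteria: the best answer is the choice coordinate with the second abstraction in the first part of the analogy.

```python
# Assumed levels of abstraction; a larger number means a more general term.
# Only the ordering within each analogy is taken from the article.
LEVEL = {
    "Sally": 0, "Bob": 0, "Billy": 0, "brother": 1, "girl": 2, "boy": 2,
    "robin": 0, "grasshopper": 0, "bird": 1, "insect": 1, "animal": 2, "life": 3,
}


def best_answer(first, second, third, choices):
    """Return the choice that is coordinate with the second term."""
    if LEVEL[third] != LEVEL[first]:
        raise ValueError("the third term must be coordinate with the first")
    matches = [c for c in choices if LEVEL[c] == LEVEL[second]]
    if len(matches) != 1:
        raise ValueError("exactly one choice should be coordinate with the second term")
    return matches[0]


# Classifying item: Sally is to girl as Bob is to ...
print(best_answer("Sally", "girl", "Bob", ["brother", "Billy", "boy"]))
# Indexing item: Bird is to robin as insect is to ...
print(best_answer("bird", "robin", "insect", ["grasshopper", "animal", "life"]))
```

In a classifying item the answer lies above the stem term on the ladder of abstraction; in an indexing item it lies below.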


The same population was used in the indexing
test as in the classifying test. The Z-Score of
each item in the tests taken by third and fourth
grade subjects was determined by the standard
error of the difference between proportions.
The Split-Half Method and Spearman-Brown
“Prophecy-Formula” estimated the coefficient
of reliability to be .88, corrected .88.


The fifty items that were included in the final
test on indexing ideas met the following criteria:


1. Combined data of each item between grades
   three and four giving a Z-Score of at least 3.00,
   significant at the one percent level. Z-Score
   was determined by the standard error of the
   difference between proportions; in this case,
   top one hundred and bottom one hundred.

2. Each item on the test taken by fourth grade
   subjects having a Z-Score of at least .40. Fifty
   items selected in order to have a mean Z-Score
   of 2.02, significant at the 5 percent level.
   Z-Score was determined by the Standard Error
   of the Difference Between Proportions; in this
   case, using the top and bottom fifty.

3. Minimum increase of eighteen percent of correct
   responses for each item from the third to
   the fourth grade level.

   a) Fourth grade subjects. All ninety-four subjects
      in three fourth grade rooms were tested.
      Six additional fourth grade pupils were
      chosen at random from another fourth grade
      class.

   b) Fifth grade subjects. In three fifth grade
      classes there were ninety-seven pupils.
      Three pupils from another fifth grade class
      were added.
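The item-selection criteria listed above can be sketched as a filter in Python. The record type, field names, and sample figures are all hypothetical. Reading the “eighteen percent” increase as eighteen percentage points is an assumption, and so is applying the ordering by fourth grade difficulty, which the article states for the classifying test, to this test as well.

```python
from dataclasses import dataclass


@dataclass
class ItemStats:
    name: str
    z_combined: float          # grades three and four combined
    z_grade4: float            # fourth grade subjects only
    pct_correct_grade3: float
    pct_correct_grade4: float


def meets_criteria(item, min_z_combined=3.00, min_z_grade4=0.40, min_gain=18.0):
    """Apply the three numbered criteria to one item."""
    gain = item.pct_correct_grade4 - item.pct_correct_grade3
    return (
        item.z_combined >= min_z_combined
        and item.z_grade4 >= min_z_grade4
        and gain >= min_gain
    )


def select_items(items, **thresholds):
    """Keep qualifying items, ordered from easiest to hardest for grade four."""
    kept = [item for item in items if meets_criteria(item, **thresholds)]
    return sorted(kept, key=lambda item: item.pct_correct_grade4, reverse=True)


# Hypothetical item statistics.
sample = [
    ItemStats("a", 3.5, 0.9, 40.0, 62.0),
    ItemStats("b", 2.1, 0.9, 40.0, 70.0),   # fails the combined Z-Score
    ItemStats("c", 3.2, 0.5, 50.0, 60.0),   # gains only ten points
    ItemStats("d", 4.0, 1.2, 30.0, 80.0),
]
chosen = select_items(sample)
print([item.name for item in chosen])
```

For the classifying test the first threshold would be 3.13 instead of 3.00, for example `select_items(sample, min_z_combined=3.13)`.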


After the preliminary steps of constructing and 
standardizing instruments were completed, the 
final survey was made. The three instruments 
were administered to one hundred fourth grade 
subjects. The subjects had been administered 
the Stanford-Binet Test of Intelligence, Form L, 
1937 Revision, during their first year of school. 
The purpose of the survey was to investigate re- 
lationships that exist among the comprehension 
of verbal abstractions, background of information, 
and intelligence. The Pearson Product Moment 
Coefficient of correlation was employed to deter- 
mine relationships among abilities. 


SUMMARY OF RESULTS 


The experimental Inventory on Background of 
Information was administered to subjects at four 
grade levels, to determine the discriminative 
value of its items. A summary of data is pre- 
sented in Table I. 

The combined data on the experimental Inventory
on Background of Information taken by 100
subjects in four grade levels are summarized in
Table II. These data are concerned with item
difficulty for the total population.


Experimental Test on Classifying Ideas 


The data from the experimental test on classi- 
fying ideas will be presented in two categories: 


TABLE I

EXPERIMENTAL DATA INVENTORY ON BACKGROUND OF INFORMATION:
PERCENT OF POPULATION WITH CORRECT ITEMS BY GRADE LEVELS

(25 Subjects at Each Grade Level)

                 Grade I    Grade II   Grade III  Grade IV
                 No. of     No. of     No. of     No. of
Percent of       Correct    Correct    Correct    Correct
Population       Items      Items      Items      Items

   100                         1          4         36
    96                         0          6         31
    92                         2          8         16
    88                         3          6         52
    84              1          2          8
    80              1          3         16
    76              0          5          5
    72              0          8          6
    68              1          6          9
    64              0          7          1
    60              2         10          7
    56              3          4          6
    52              5          5          6
    48              4          6         17
    44              2          4          9
    40             10          9         12
    36              5         11          8
    32              8         10          9
    28              9         11          9
    24              5         10          9
    20              8         19         10
    16             19         10          8
    12             10         19         17
     8             27         19          9
     4             21         18         15
                  ---        ---        ---
                  141        195        219

No. Items
Failed            105         51         27

Total No.
of Items          246        246        246        246


TABLE II

EXPERIMENTAL DATA ON INVENTORY OF BACKGROUND OF INFORMATION:
AVERAGE PERCENT OF TOTAL POPULATION WITH CORRECT ITEMS

(25 Subjects in Each of Grades I, II, III, and IV)

Average Percent of       Number of Correct
Total Population         Items

   80 - 89                   8
   70 - 79                  21
   60 - 69                  25
   50 - 59                  21
   40 - 49                  36
   30 - 39                  35
   20 - 29                  25
   10 - 19                  36
    1 - 9                   26
                           ---
                           239

No. of Items Failed          7
Total No. of Items         246
TABLE III

SUMMARY OF DATA ON EXPERIMENTAL CLASSIFYING TEST
FOR THREE GRADE LEVELS

                 Grade III   Grade IV   Grade V
Scores               f           f         f

55 - 59
50 - 54              1
45 - 49              5
40 - 44             13
35 - 39             14
30 - 34             22
25 - 29             21
20 - 24
15 - 19              8
10 - 14              2

Total No. of
Subjects           100         100       100
Mean             [...].65     38.05     41.95
SD               [...]         6.10      6.50


TABLE IV

SUMMARY OF DATA: ITEM ANALYSIS FOR EXPERIMENTAL TEST
ON CLASSIFYING IDEAS

(100 Subjects in Grades III and IV)

                       Grade III   Grade IV   Grades III - IV
                                              (Combined Data)
                       No. of      No. of     No. of
Z-Score for Items      Items       Items      Items

 .12 - 1.95*             12          26          1
1.96 - 2.57**            23          11          1
2.58 - 6.36***           21          19         54

Total No. of Items       56          56         56

  * Not significant
 ** Significant at the 5 percent level
*** Significant at the 1 percent level


TABLE V

SUMMARY OF DATA ON EXPERIMENTAL INDEXING
TEST FOR THREE GRADE LEVELS

                 Grade III   Grade IV   Grade V
Scores               f           f         f

50 - 54                                    1
45 - 49                          7         9
40 - 44              2          10        27
35 - 39             14          25        22
30 - 34             15          27        26
25 - 29             17          16        12
20 - 24             26          10         3
15 - 19             14           4
10 - 14             12           1

Total
Population         100         100       100
Mean             24.95       32.70
SD                5.90        6.85


TABLE VI

SUMMARY OF DATA: ITEM ANALYSIS FOR EXPERIMENTAL TEST
ON INDEXING IDEAS

(100 Subjects in Grades III and IV)

                       Grade III   Grade IV   Grades III - IV
                                              (Combined Data)
                       No. of      No. of     No. of
Z-Score for Items      Items       Items      Items

 .00 - 1.95*             27          19          2
1.96 - 2.57**            20          32          1
2.58 - 6.36***            9           5         53

Total No. of Items       56          56         56

  * Not significant
 ** Significant at the 5 percent level
*** Significant at the 1 percent level


TABLE VII

SUMMARY OF DATA: FINAL INVENTORY
ON BACKGROUND OF INFORMATION

(100 Fourth Grade Subjects)

                     Grade IV
Scores                   f

91 - 100                63
81 -  90                28
71 -  80                 3
61 -  70                 4
51 -  60                 2

Total Population       100
Mean                 90.54
SD                    9.86


TABLE VIII

SUMMARY OF DATA ON FINAL CLASSIFYING TEST

                     Grade IV
Scores                   f

45 - 49                 12
40 - 44                 29
35 - 39                 24
30 - 34                 22
25 - 29                  6
20 - 24                  5
15 - 19                  1
10 - 14                  1

Total Population       100
Mean                 36.75
SD                    6.90


TABLE IX

SUMMARY OF DATA ON FINAL INDEXING TEST

                     Grade IV
Scores                   f

40 - 44                  8
35 - 39                 26
30 - 34                 34
25 - 29                 13
20 - 24                 12
15 - 19                  5
10 - 14                  2

Total Population       100
Mean                 31.10
SD                    5.10
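The mean reported for the final indexing test can be reproduced from the grouped frequencies. The sketch below is an illustration and assumes that each score interval is represented by its midpoint, which is the usual hand method for grouped data. The article does not describe its computing procedure, and the names are hypothetical.

```python
# Score intervals and frequencies as tabulated for the final indexing test.
FINAL_INDEXING = [
    ((40, 44), 8),
    ((35, 39), 26),
    ((30, 34), 34),
    ((25, 29), 13),
    ((20, 24), 12),
    ((15, 19), 5),
    ((10, 14), 2),
]


def grouped_mean(table):
    """Mean of a frequency distribution, using interval midpoints."""
    total = sum(freq for _, freq in table)
    weighted = sum(((low + high) / 2) * freq for (low, high), freq in table)
    return weighted / total


print(sum(freq for _, freq in FINAL_INDEXING))   # number of subjects
print(round(grouped_mean(FINAL_INDEXING), 2))    # mean score
```

The frequencies sum to the stated population of 100, and the midpoint method gives the tabulated mean of 31.10.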


TABLE X

SUMMARY OF RELATIONSHIPS: BACKGROUND OF INFORMATION,
CLASSIFYING, INDEXING, AND INTELLIGENCE

(100 Fourth-Grade Subjects)

Instruments                                      r

[...] with Binet                               [...]
Indexing Test with Binet                       [...] ± .020
[...] Test with Indexing Test                   .93  ± .013


(1) scores on classifying test and (2) analysis of
items.

Scores on Classifying Test. —The frequency
distributions, with means and standard deviations
for grades three, four, and five are presented
in Table III.

The findings indicate that the 56 items in the
Classifying Test were discriminating among the
abilities of pupils in three grades.

Analysis of Items. —The statistical technique
used to determine the significance of items for the
test on classifying was the standard error of the
difference between proportions.

A summary of the data on item analysis for
the Classifying Tests for grades three and four
is presented in Table IV.

The findings on the significance of items at
the 5 or 1 percent level of confidence were: (1)
third grade subjects, 79 percent; (2) fourth grade
subjects, 53 percent; and (3) combined data for
third and fourth grade subjects, 98 percent.
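The percentages quoted above can be checked against the item counts in Table IV. The sketch below is illustrative: the counts are entered as read from the table, and the dictionary layout and function name are hypothetical.

```python
# Number of the 56 items in each Z-Score band, as read from Table IV.
TABLE_IV = {
    "Grade III": {"not significant": 12, "5 percent": 23, "1 percent": 21},
    "Grade IV": {"not significant": 26, "5 percent": 11, "1 percent": 19},
    "Grades III-IV combined": {"not significant": 1, "5 percent": 1, "1 percent": 54},
}


def percent_significant(counts):
    """Share of items significant at the 5 or 1 percent level."""
    total = sum(counts.values())
    significant = counts["5 percent"] + counts["1 percent"]
    return 100 * significant / total


for group, counts in TABLE_IV.items():
    print(group, round(percent_significant(counts), 1))
```

Each computed share agrees with the figure in the text (79, 53, and 98 percent) to within one percentage point.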


Experimental Test on Indexing Ideas


The data from the experimental test on indexing
ideas will be presented in two categories:
(1) scores on indexing test, and (2) analysis
of items.

Scores on Indexing Test. —The frequency
distributions, with means and standard deviations
for grades three, four, and five are presented
in Table V. The findings indicate that
the 56 items in the Indexing Tests were discriminating
in difficulty among the three grade levels.

Analysis of Items. —The statistical technique
used to determine significance of items for the
Indexing Test was the standard error of the difference
between proportions.

A summary of the data on item analysis for
the Indexing Test for grades three and four is
presented in Table VI.

The findings on the significance of items at
the 5 or 1 percent level of confidence were: (1)
third grade subjects, 52 percent; (2) fourth grade
subjects, 66 percent; and (3) combined data for
third and fourth grade subjects, 98 percent.


Results of Tests Employed in the Final Survey


In the final survey, the three instruments constructed
for this investigation were administered
to a fourth grade population. The results of
the Stanford-Binet Test of Intelligence, Form L,
1937 Revision, that was administered to the
subjects during their first year of school are
reported last. The Final Inventory on Background
of Information was so constructed that fourth
grade pupils would answer all items at 88 to 100
percent. A summary of the data on the final
Inventory taken by the 100 subjects participating
in the actual survey is reported in Table VII.


Final Test on Classifying Ideas 


Data concerning the Classifying Test administered to the 100
fourth grade subjects participating in the survey are presented
in Table VIII. The frequency distribution of the scores obtained
on the fifty items, with the mean and standard deviation, is
reported.


Final Test on Indexing Ideas 


Data concerning the indexing test administered to the 100 fourth
grade subjects participating in the final survey are presented in
Table IX. The frequency distribution of scores obtained on the
fifty items, with the mean and standard deviation, is included in
Table IX.


Stanford-Binet Test 


The Stanford-Binet Test of Intelligence, Form L, 1937 Revision,
had been administered to all fourth grade subjects during their
first year of school. The intelligence quotient range was from
72 to 151. The mean and standard deviation were:

1. Mean 113.05
2. Standard deviation 11.62


Relationships Among Abilities 


The findings from the instruments administer- 
ed to the 100 fourth grade subjects participating 
in the final survey were used to determine rela- 
tionships that exist among various abilities: 


1. Relationship between comprehension of verbal abstractions
and background of information.

2. Relationship between comprehension of verbal abstractions
and intelligence.

3. Relationship between abilities to classify and index ideas.


The Pearson product-moment formula was employed. A summary of
relationships is presented in Table X.
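
For readers who want the computation behind these relationships, the product-moment coefficient can be sketched as follows. The function name and the paired scores are hypothetical; the study reports only the resulting coefficients in its Table X, not the raw scores.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two paired score lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two paired lists with at least two scores each")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squares of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a list with no variation has no correlation")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical classifying and indexing scores for five pupils.
classifying = [30, 35, 40, 45, 50]
indexing = [28, 33, 41, 44, 49]
r = pearson_r(classifying, indexing)
```

A coefficient near +1 from such paired scores corresponds to what the conclusions describe as a high, positive relationship.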
CONCLUSIONS 


Within the limitations of this study, the follow- 
ing conclusions appear to be valid: 


1. There is a high, positive relationship be- 
tween the comprehension of verbal abstrac- 
tions and background of information. 

a. There is a high, positive relationship between
ability to classify ideas and background of
information.

b. There is a high, positive relationship be- 
tween ability to index ideas and back- 
ground of information. 


116 


JOURNAL OF EXPERIMENTAL EDUCATION 


(Vol. 22 2 
c. The relationship between indexing ability and
background of information ...

2. There is a high, positive relationship between
comprehension of verbal abstractions and intelligence.

a. There is a very high, positive relationship
between ability to classify ideas and intelligence.

b. There is a very high, positive relationship
between ability to index ideas and intelligence.

c. The relationship between indexing ability and
intelligence is slightly lower than the ability to
classify ideas and intelligence.

3. There is a very high, positive relationship between
ability to classify and index ideas.


IMPLICATIONS FOR PRESENT PRACTICES


This study appears to have certain implications for teachers in
their guidance of comprehension of concepts at different levels
of complexity.

1. Comprehension of concepts of science found in primary
reading materials tends to be lacking in young children.
More emphasis should be placed on developing concepts that
appear in primary science materials.

2. Comprehension of concepts is greater with pupils of
superior intelligence than with those of average or low
intelligence. To enhance conceptual comprehension for the
average or "slow" learner, efforts should be made to develop
a background of experience for concepts found in
instructional materials. This may be arranged by providing
experiences ranging from direct to vicarious ones.

3. Greater emphasis appears to have been on an inductive
rather than deductive approach to comprehension of concepts
at different levels of complexity. For increased
comprehension of concepts, deductive as well as inductive
instruction should be provided.


SUGGESTIONS FOR FURTHER STUDY


... should be conducted at various ... and grade levels. Further
research could establish the relationship between the
comprehension of verbal abstractions and intelligence. ... at the
college level ... classifying and indexing ability of more mature
students ... the comprehension ...




A STATISTICAL ANALYSIS OF CERTAIN EDUCATIONAL
VIEWPOINTS HELD BY TEACHERS*


DAVID G. RYANS
University of California at Los Angeles


EDUCATIONAL VIEWPOINTS quite generally are presumed to be
important factors in determining what shall be taught in the
school and how it shall be taught.

The composite of educational viewpoints, or educational
philosophy, accepted by the superintendent of schools and his
administrative staff (presumably with the approval of the school
board and, in turn, reflecting the sentiment of the community)
and applied with the cooperation of the teaching corps,
determines the objectives of teaching to which the school system
is committed. It is expected that individual teachers will
conduct their classes in keeping with indicated objectives.

Of course, the educational viewpoints, or philosophy, of
individual teachers may or may not conform to those of the school
system in which the teacher is employed. Furthermore, because of
lack of real understanding of the implications of the viewpoints
held, or inability to translate the viewpoints into classroom
behavior, or perhaps external pressure to the contrary, a teacher
may not conduct his classes in keeping with the viewpoints he
holds about educational matters. Nevertheless, in terms of
probable inference, we would expect teachers committed to certain
educational philosophies to behave differently in specified
school situations than teachers committed to other educational
viewpoints.

With such a postulate in mind (that teachers' educational
viewpoints are reflected in their classroom behaviors) the
present study was undertaken in an attempt to better understand
the organization of teachers' educational viewpoints. Analyses
were made to determine how certain viewpoints of teachers might
be intercorrelated and the extent to which such viewpoints might
be clustered around some central core, or, on the other hand,
might be characterized by multi-dimensionality.


*The analysis reported was carried out as one of a number of
researches conducted in connection with the Teacher
Characteristics Study. The Teacher Characteristics Study is a
project of the Grant Foundation and the American Council on
Education. Central offices of the project are located at the
University of California at Los Angeles.


Hypothesis 


It was hypothesized that the educational view- 
points of teachers tend to be organized into sev- 
eral relatively independent clusters with respect 
to academic achievement standards, curricular 
organization, pupil participation in class plan- 
ning, and similar matters—that a teacher's 
viewpoints with regard to certain significant ed- 
ucational practices may bear little or no relation 
to his viewpoints regarding other significant prac- 


tices. 


Procedures 


To test this hypothesis: (1) two forms of an Educational
Viewpoints Inquiry were constructed and administered respectively
to elementary and secondary teachers enrolled in summer session
classes in geographically scattered colleges of teacher
education; (2) responses to the Educational Viewpoints Inquiry
were tabulated and tetrachoric intercorrelation coefficients
computed between the items of each form (elementary form and
secondary form); and (3) each of the two resulting correlation
matrices (one based on the responses of elementary teachers and
the other based on the responses of secondary teachers) was
factor analyzed.

The instrument employed for sampling educational viewpoints, the
Educational Viewpoints Inquiry, was a short form made up of 20
items, each item forcing a choice between contrasting viewpoints
regarding educational practices. The form of the Educational
Viewpoints Inquiry employed with elementary school teachers is
shown in Figure 1.
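
Because every item forces a choice between two responses, each pair of items yields a fourfold table, and a tetrachoric coefficient estimates the correlation that would hold if both items reflected underlying continuous, normally distributed traits. The sketch below uses the well-known cosine-pi approximation rather than an exact solution. The report does not say how its coefficients were computed, so the method, the function name, and the counts here are assumptions made only for illustration.

```python
import math


def tetrachoric_cos_pi(a, b, c, d):
    """Cosine-pi approximation to the tetrachoric correlation.

    a: teachers choosing the first response on both items
    b: first response on item X, second response on item Y
    c: second response on item X, first response on item Y
    d: teachers choosing the second response on both items
    """
    if min(a, b, c, d) < 0:
        raise ValueError("cell counts cannot be negative")
    if b * c == 0:
        if a * d == 0:
            raise ValueError("coefficient is undefined for this table")
        return 1.0   # no disagreeing pairs at all
    if a * d == 0:
        return -1.0  # no agreeing pairs at all
    return math.cos(math.pi / (1.0 + math.sqrt((a * d) / (b * c))))


# Hypothetical fourfold table for two items answered by 100 teachers.
r_t = tetrachoric_cos_pi(40, 10, 10, 40)
```

When nearly all respondents choose the same response, one or more cells are close to empty and the estimate becomes unstable.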

As may be noted from Figure 1, items included in the Inquiry were
devised to sample viewpoints with respect to: (1) curricular
organization;


Figure 1


EDUCATIONAL VIEWPOINTS INQUIRY
(Elem.)


(Spec. 6-1-50)


Give your reaction to each of the following questions by marking
an X in the box beside the response that corresponds to your
viewpoint.


1. In your class do you prefer to have a penmanship period,
( ) separate from the language expression period?
( ) combined with the language expression period?

2. ...
( ) participate in planning the school program?
( ) be well informed regarding the school policies, but not
encouraged to participate directly in ... program?

3. In planning units of classwork do you believe that
( ) this should be largely the responsibility ...
( ) the suggestions should come ...

4. ...
( ) most of the reading content related to the language
expression unit?
( ) the reading work and the language expression work largely
separate?

5. In your class do you believe it is better
( ) to have several activities going on at once?
( ) generally to have one activity in progress at a time?

6. In planning the social studies content for your class do you
believe
( ) the content should be prescribed and relatively closely ...
( ) considerable variation should be allowed in ... content?

7. Is the average third grade child capable of "self-control" in
the sense in which we usually use the term in referring to older
persons?
( ) Yes
( ) No

8. In your class do you think it is better to have ...

... out of school activities ...
( ) No

...
( ) a single unified group?
( ) several smaller reading groups?

11. In the third and fourth grades do you think it is preferable
to have
( ) special teachers for all art and music instruction?
( ) art and music taught exclusively by the regular classroom
teacher?

12. If it were necessary to use a minimum school day for your
class and you were free to plan the activities, do you believe it
would be better
( ) to keep approximately the regularly allotted time for reading
and number work, limiting somewhat the time for social studies?
( ) to keep approximately the regularly allotted time for social
studies, limiting somewhat the time for reading and number work?

13. Do you believe that parents should be permitted to visit
classroom during regular class hours?
( ) Yes
( ) No

14. Do you feel that effective control of pupils in class can be
achieved by making the class itself completely responsible for
such controls?
( ) Yes
( ) No

15. If it were necessary to use a minimum school day and you were
free to plan the activities, do you believe it would be better
( ) to keep approximately the regularly allotted time for reading
and number work, limiting somewhat the time for art and music?
( ) to keep approximately the regularly allotted time for art and
music, limiting somewhat the time for reading and number work?

16. Do you believe that you should have considerable freedom to
modify courses of study or units of work from class to class?
( ) Yes
( ) No

17. At the third and fourth grade levels do you believe
( ) it is important to set and require relatively high standards
of pupil achievement in the subjects taught?
( ) academic achievement is relatively unimportant as compared
with other objectives?

18. Do you feel that classroom responsibilities and
administrative responsibilities
( ) are fairly distinct and should be clearly defined and
separated in the most effective school program?
( ) overlap and should be participated in by all personnel
regardless of particular assignments?

19. In the third grade do you believe that home study
supplementing and complementing classwork
( ) is generally desirable?
( ) is generally undesirable?

20. Do you believe that third grade children should
( ) be required to meet prescribed ... standards before beginning
the work of the fourth grade?
( ) not be required to meet minimum ... academic achievement
before beginning fourth grade work?

TABLE I


MATRIX OF TETRACHORIC CORRELATION COEFFICIENTS (ELEMENTARY TEACHERS)

(The entries of this table are not legible in this copy.)


December 1953) 


RYANS 


TABLE II


CENTROID FACTOR MATRIX (ELEMENTARY TEACHER SAMPLE)


Item     I    II   III    IV     V    VI    h²

  2    -63    14    22   -10    34   -10   ...
  3     51   -39    18    19    39   -10   ...
  5    ...    09    06   -20    08    22    45
...    -20   ...    42    27    15   -16   ...
  8     49   -36    33    16    05   ...   ...
  9    -53    14   -23    35   -10    07    49
...    -35   -08    10    22   -35   ...   ...
 17     63    30    18    15   -09    16    58
 18     49    31    18    07   -11   -17   ...
 19     47    28   -34    21    22    16    53
...    -23 -30 04 15 (remainder of row illegible)

(... = entry illegible in this copy)
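
The last column of the centroid factor matrix appears to be the communality of each item, that is, the sum of its squared loadings on the six factors. The short check below uses three rows that are fully legible in the scan. Reading the final column as a communality, reading the entries as hundredths, and taking the tenth row to be item 19 are all assumptions.

```python
def communality_hundredths(loadings):
    """Sum of squared loadings, with loadings and result both in hundredths."""
    return sum(value * value for value in loadings) / 100


# (six centroid loadings, printed last-column value), as read from the scan.
legible_rows = {
    9: ([-53, 14, -23, 35, -10, 7], 49),
    17: ([63, 30, 18, 15, -9, 16], 58),
    19: ([47, 28, -34, 21, 22, 16], 53),
}

# Compare each computed communality, rounded, with the printed value.
agreement = {
    item: round(communality_hundredths(row)) == printed
    for item, (row, printed) in legible_rows.items()
}
```

All three rows agree with the printed values after rounding, which supports that reading of the column.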


TABLE III


ROTATED PRIMARY FACTOR MATRIX: OBLIQUE SOLUTION
(ELEMENTARY TEACHERS)


   I    II   III    IV     V    VI

 -12    43   -30   -10    20    01
 -17    09    67    02    18   -02
 -10    18   -39   -02    08    27
  18    57    00    03   -05   -11
 -06    06    64   -02   -05    24
 -12    15   -29    31   -34   -04
 -16    05    00   -02   -48    06
  57    13    14    22    05    12
  47    09    08   ...    04   -17
  37    02   -06    54    23    00
 -06 -18 -07 14 02 (remainder of row illegible)

(... = entry illegible in this copy)


(2) pupil participation in class planning; (3) academic
achievement standards; (4) administrative responsibilities;
(5) parent participation in the educational program; and
(6) class, or course, planning and procedure.

A number of item rationales, or hypotheses, were developed in
each major category and materials were selected for apparent
representativeness and significance. The items went through a
number of revisions, based upon the ... means were available ...
empirically; at least, not ... item analyses against somewhat
removed external criteria related to certain ... pupil behavior,
i.e., pupil behavior in a teacher's class, systematic and
responsible classroom behavior on the part of the teacher,
sympathetic-understanding behavior ... carried out for other ...
purpose of the ... results (the correlation of ... Educational
Viewpoints Inquiry ... of teacher behavior) may be ...

... sex, extent of teaching experience, extent of training, and
the like. Such items provided control data.


Eight of the teacher education institutions were public
universities, one was a state college, and one was a private
university. The responses of both experienced teachers and
teachers in training were obtained. The ratio of experienced to
inexperienced teachers in the samples employed was approximately
2 to 1.

Upon receipt ... tables were ... responses and tetrachoric
correlation coefficients were computed between each ... for the
elementary teacher sample and the secondary teacher sample.

Prior to the undertaking of factor analyses the items and
intercorrelations were reviewed and some eliminated due to
disproportionate acceptance of one response over the alternative
response. (For example, 330 of the 338 secondary school teacher
respondents answered "yes" to the question: ...) ... items (...
2, 7, 9, 10, 11, 12, 14, 17, 18, 19, and 20) made up the
correlation table for the secondary form of the Educational
Viewpoints Inquiry.

Factor analyses were accomplished by the centroid method
described by Thurstone.
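
The central step of the centroid method can be sketched briefly. This is a minimal illustration under stated assumptions, not a reconstruction of the computations behind the tables: the 3 x 3 correlation matrix is hypothetical, the diagonal is filled with the highest absolute off-diagonal entry of each column (one common communality estimate), and the sign reflection and repeated extraction that a full analysis needs are left out.

```python
import math


def first_centroid_factor(r):
    """Extract one centroid factor from a square correlation matrix.

    Returns the loadings and the residual matrix left after removing them.
    """
    n = len(r)
    m = [row[:] for row in r]
    # Estimate each communality by the highest absolute off-diagonal entry.
    for j in range(n):
        m[j][j] = max(abs(r[i][j]) for i in range(n) if i != j)
    col_sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    total = sum(col_sums)
    if total <= 0:
        raise ValueError("reflect variables so the column sums are positive")
    root = math.sqrt(total)
    # Each loading is the column sum divided by the root of the grand total.
    loadings = [s / root for s in col_sums]
    residual = [
        [m[i][j] - loadings[i] * loadings[j] for j in range(n)]
        for i in range(n)
    ]
    return loadings, residual


# Hypothetical intercorrelations among three items.
r_matrix = [
    [1.0, 0.6, 0.5],
    [0.6, 1.0, 0.4],
    [0.5, 0.4, 1.0],
]
loadings, residual = first_centroid_factor(r_matrix)
```

The residual entries sum to zero, which is why later factors can only be extracted after some variables are reflected in sign.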
Results


Six centroid factors were extracted and subsequently rotated to
oblique simple structure. ... The first factor seemed fairly
clearly to have to do with emphasis upon ... functions of the
teacher, a belief, perhaps, in the importance of the
"fundamentals." Factor 3 also seemed to reflect a so-called
"traditional" subject-matter curricular emphasis. Factors 1 and 3
were correlated to the extent of 0.44.

Factor 2 had only two significant loadings. It


TABLE IV


TRANSFORMATION MATRIX (ELEMENTARY TEACHER SAMPLE)


         I    II   III    IV     V    VI

I       39   -22    42    08    21   -01
II      89    41   -74    34    24   -10
III     17    60    36   -40   -05    26
IV     -06    43    38    60   -58   -22
V      -07    47    06    31    75    10
VI     ...    12   -10    51    03    93


TABLE V


CORRELATIONS BETWEEN PRIMARY FACTORS (ELEMENTARY
TEACHER SAMPLE)


         I    II   III    IV     V    VI

I      100
II     -31   100
III     44   -22   100
IV     -13   -28    13   100
V      -13   -09    21    13   100
VI     -06 -08 -21 -20 (remainder of row illegible)

(... = entry illegible in this copy)
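
The printed tables appear to be linked by the usual relation for an oblique rotation: the rotated matrix equals the centroid matrix multiplied by the transformation matrix. The check below recomputes two factor II loadings from legible entries. It assumes that all entries are hundredths, that the unlabeled last row of the transformation matrix is row VI with its first entry missing, and that the rows of the rotated matrix follow the item order of the centroid matrix.

```python
# Centroid loadings (Table II) for two items, in hundredths.
centroid_rows = {
    2: [-63, 14, 22, -10, 34, -10],
    17: [63, 30, 18, 15, -9, 16],
}

# Column II of the transformation matrix (Table IV), rows I to VI.
lambda_col_ii = [-22, 41, 60, 43, 47, 12]

# Factor II loadings printed in the rotated matrix (Table III).
printed_factor_ii = {2: 43, 17: 13}


def rotated_loading_hundredths(centroid_row, lambda_col):
    """Inner product of a centroid row and a transformation column."""
    return sum(f * l for f, l in zip(centroid_row, lambda_col)) / 100


computed_factor_ii = {
    item: round(rotated_loading_hundredths(row, lambda_col_ii))
    for item, row in centroid_rows.items()
}

# Each transformation column should be close to unit length.
col_ii_length_sq = sum(v * v for v in lambda_col_ii) / 10000
```

Both recomputed loadings match the printed ones, and the column has nearly unit length, which is consistent with the assumed placement of the partly illegible row.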


TABLE VI


MATRIX OF TETRACHORIC CORRELATION COEFFICIENTS (SECONDARY TEACHERS)
N=338

(The entries of this table are not legible in this copy.)


TABLE VII


CENTROID FACTOR MATRIX (SECONDARY TEACHER SAMPLE)

(The entries of this table are not legible in this copy.)


TABLE VIII


ROTATED PRIMARY FACTOR MATRIX: OBLIQUE SOLUTION
(SECONDARY TEACHER SAMPLE)


Item     I    II   III    IV     V    VI

  1    -12 (remainder of row illegible)
  2    -11    35    26   -04   -11    16
  3    -07    00   -15    41   -34    04
  5     14    07    17   -58   -06    00
  7    -07   -01    45    01    34    03
  9    -08    65   -01    11   -11   -15
 10     03   -02    16    48    08    51
 11     52   -06   -33    23   -12    40
 12     24    06    26    10   -42    07
 14     25    50    13   -11    42    06
 17     68    10    00    07    14    00
 18     32   -08   -36   -01   ...   -06
 19     14    00   -06   -01   -10   -46
 20     73   -03    08   -07   ...   ...


TABLE IX


TRANSFORMATION MATRIX (SECONDARY TEACHER SAMPLE)


         I    II   III    IV     V    VI

I       39   -26   -24    21   -21    09
II      82    43   -05   -46   -09   -07
III     05    66    73    56   -08    06
IV     -18    55   -38    62   -01    36
V       38    00    26    08    84    48
VI      09   -09    44   -01   ...   ...


TABLE X


CORRELATIONS BETWEEN PRIMARY FACTORS (SECONDARY
TEACHER SAMPLE)


         I    II   III    IV     V    VI

I      100
II     -42   100
III     08   -28   100
IV      57   -57    12   100
V      -15    06   -01   -09   100
VI     -38    27   -35   -47    06   100

(... = entry illegible in this copy)


appeared to have something to do with the teacher's democratic
viewpoints toward parents and pupils and his willingness to
recognize the possible contributions of such persons.

Factor 4 had only one significant loading, this having to do with
belief in the desirability of home study. (This response also
contributed significantly to factor 1. It seems doubtful that
factor 4 need be further considered.)

Factor 5 had only two significant loadings and was difficult to
identify in terms of its psychological meaning.

Factor 6 appeared to be of no great significance, having no
factor loading as high as 0.30.

As has been noted already, the factors extracted were correlated.
The correlations between primary factors derived from the matrix
transform suggest that while several clusters of viewpoints ...
to the several factors extracted ... therefore, that among the
various ... perhaps with regard to the matter of academic
emphasis.


Secondary Form of the Educational Viewpoints 
Inquiry 


The range in magnitude of the intercorrelations between items
comprising the secondary form ... correlation coefficients
extending from 0.02 to 0.68.

Again, six centroid factors were extracted and rotated to simple
structure. The resulting factors were oblique and were even less
well-distinguished from one another than those resulting from the
analysis of the elementary form of the Inquiry.

Factor 1 was fairly well-defined and appeared to have to do with
subject-matter emphasis and academic standards, similar, it would
appear, to factor 1 of the elementary analysis.
Factor 2, stated negatively, seemed to be related to the
teacher's belief that most course content must be developed
systematically around basic principles or facts rather than
around "real life situations." Factor 2 was correlated 0.42 with
factor 1.

Factor 4 seemed to be identified with viewpoints that embrace
singleness of purpose and ...

Factor 6 is closely related to factors 1 and 4, there being a
sharing of significant factor loadings with respect to items
number 10 and 11. If the factor loadings of items 10 and 11 are
disregarded, this factor might be described in negative terms as
having to do with approval of homework for high school students.
Factor 6 is correlated 0.38, 0.27, and 0.47 respectively with
factors 1, 2, and 4.

Factor 3 is not clearly defined, but appears to be related to
willingness to share responsibility and to accept the cooperation
of others.

Factor 5 seems to have to do with the teacher's faith in the
capabilities of his students and willingness to accept them as
individuals.

As in the case of the elementary analysis described above, the
intercorrelations between the factors suggest that further
combination of the items might reveal more satisfactory bases for
individual evaluation than the obtained factor scores. Items 11,
12, 17, 19, 20, 9 (in negative terms), 3, 5 (negative), 10, and
perhaps 19, appear to form a somewhat meaningful pattern related
to subject-matter centered teaching and emphasis upon standards
of achievement.

A second cluster of viewpoints, perhaps of ... importance, may be
identified with factor ...

Tables VI through X present the basic results showing
respectively for the secondary teacher sample the matrix of
correlation coefficients, the centroid factor matrix, the rotated
primary factor matrix, the transformation matrix, and the
correlations between primary factors.

... organizations of viewpoints are suggested ... However, the
relatively ... variance that could not be ... of group factors
indicates ... on the part of the items making up the Educational
Viewpoints Inquiry.

A composite score based upon responses to the Inquiry might
reasonably be employed to indicate the general tendency of the
teacher to associate himself with so-called "modern" educational
viewpoints as contrasted with viewpoints that sometimes have been
called "traditional." Such a score might even have use in
attempting to place teachers in school systems with educational
philosophies more or less similar to those held by individual
teachers. But the results of the factor analyses presented here
suggest that the viewpoints contributing to such a total score
are fairly heterogeneous and tend not to be highly
intercorrelated.


CURRICULUM FOR PRIMARY TEACHERS" 


SINA M. MOTT 
Southern Illinois University 
Carbondale, Illinois 


Introduction 


REALIZING THAT the personality and educational background of the
primary teachers are ..., that there is need for healthy,
vigorous teachers with friendly, cheerful personalities who have
a sense of humor, who like children and who are liked by them,
that there is need for well-groomed, intelligent, professional
minded ladies who are not only informed in their special field
but see it in the total ... picture and interpret it to others,
and ... there is need for alert teachers who, ... modern trends,
are able to work with the ... community for the best welfare of
the ..., the student branch of the Association for Childhood
Education at Southern Illinois University has made a study of
nursery-primary curriculum.


Statement of the problem 


The fields of this research are: (1) What is the Felt Need of the
Teachers Themselves; (2) What are the Requirements for
Certification; and (3) What is Being Offered in the Colleges and
Universities. This last is being omitted inasmuch as it seems
impossible to obtain an accurate summary because of the diversity
of titles and variety of ways in which the courses are written
up.


Procedure 


d; " 
What is the Felt Need of the Teachers Them- 
Selves? 


ena discovery of the felt need of the teachers 
made eded in two steps. First, a study was 
are wae the teachers of Southern Illinois who 
Petree ne in the field, and second, taking ἃ 
as m: entative group of these teachers, ἃ study 
awoke ade of their activities from the time they 
bed On Monday morning to the time they went 
Inca Sunday evening. 

In order to gain a picture of the primary teachers of Southern Illinois, information on three items was needed: (a) Are they single or married? (b) Do they have dependents? (c) Do they live in a room, an apartment, a home, or with parents? This study is made of four hundred twenty primary teachers, nursery, kindergarten, first, second, and third grade. Inasmuch as southern Illinois has no large industrial section, the four hundred twenty primary teachers in the towns and villages around Southern Illinois University well represent the total number of primary teachers of all southern Illinois.

Are They Single or Married? Of the four hundred twenty primary teachers, two percent are nursery teachers, 11 percent are kindergarten, 31 percent are first grade, 30 percent are second grade, and 26 percent are third grade. They all teach in the public schools with the exception of the nursery teachers.

All of the nursery teachers in this group are 
married. (It should be noted that the nursery 
schools are private.) There are 50 percent or 
half of both the kindergarten and the first grade 
teachers who are married, and 54 percent or more 
than half of the third grade teachers are married. 
Only in the second grade did we find fewer than 
half or 41 percent of the teachers married. Thus 
a program planned to meet the needs of the single 
teacher will not meet the needs of half the group. 
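
The claim that about half the group is married can be checked from the percentages reported above. The following is a minimal sketch, assuming the rounded percentages in the text are taken at face value; the dictionary layout and function name are illustrative only and do not come from the study.

```python
# Share of the 420 teachers at each level, and share married within each
# level, as reported in the text (rounded percentages, so the result is
# only approximate).
levels = {
    "nursery":      (0.02, 1.00),
    "kindergarten": (0.11, 0.50),
    "first grade":  (0.31, 0.50),
    "second grade": (0.30, 0.41),
    "third grade":  (0.26, 0.54),
}


def overall_married_share(levels):
    """Weighted average of the within-level married shares."""
    return sum(share * married for share, married in levels.values())


share = overall_married_share(levels)
print(f"Estimated share married: {share:.1%}")
print(f"Approximate number married out of 420: {round(share * 420)}")
```

Under these assumptions the weighted share comes out just under one half, which is consistent with the statement that a program planned only for single teachers would miss about half the group.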

These figures emphasize the fact that to plan 
a curriculum to meet the needs of the primary 
teacher, we must plan for the married woman. 
More than that, we must provide refresher 
courses for those who, having been out of school 
a number of years, need to become acquainted 
with the best in methods. 

Do They Have Dependents? The dependents
listed in order of frequency are: children, mother, 
father, both parents, and invalid husband. Where 
both the wife and husband are working, the hus- 
band is not considered a dependent. 

Fifty-eight percent of the four hundred twenty primary teachers are supporting or helping their husbands in the support of one or more individuals besides themselves.
Those who plan a primary curriculum to meet the needs of the primary teacher must take into account the fact that more than half of these girls will be either supporting or helping to support one or more individuals besides themselves.


Where Do They Live? Only six percent of


* Study made possible by a Research Grant from Southern Illinois University.


134 JOURNAL OF EXPERIMENTAL EDUCATION (Vol. 22)


these teachers come from one room, and only 
five percent come from the parental home. It 
would be of interest to learn how many of these 
are aiding in the support of the parental home. 
Only nine percent come from apartments. (Re- 
member the study is made of teachers in south- 
ern Illinois where there are no large industrial 
areas.) By far the largest number, 80 percent, 
come stepping out of homes with market baskets on their arms. The traditional book bag has disappeared.

The second step became that of determining 
how these primary teachers are able to be wives, 
mothers, daughters and housekeepers, besides 
teaching school for eight hours a day. In order 
to determine this, 37 kept a record of what they 
did each half-hour from the time they got up on 
Monday morning until they retired on Sunday 
night. This was done for one week by each of 
the 37 primary teachers. These 37 teachers 
were selected so that they represented the 420: four live with parents at home, two in rooms, four in small apartments, and 27 in their own [illegible] in college and another a parent in a sister's home); nine live in homes where the husband is absent (in the army, divorced, or dead); and 17 live with husband and family. The 37 are a cross section of the [illegible] Southern Illinois, [illegible] half-hour intervals [illegible] 37, for the seven days, was then [illegible] data for Monday is in the Monday Time Schedule at the end of this article.

For the sake of space and clarity, the number doing each activity is not recorded here. Also it will be noted that often an activity carried over several half-hour periods.
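
The tabulation described above, in which each teacher's half-hour diary entries are pooled by day and time slot, can be sketched as follows. This is only an illustration of the bookkeeping, not the procedure actually used in the study: the record layout, the teacher ids, and the pairing of teachers with activities are invented, and only the activity labels are borrowed from the Monday schedule.

```python
from collections import Counter, defaultdict

# Hypothetical diary entries: (teacher id, day, half-hour slot, activity).
records = [
    (1, "Monday", "5:00", "Arise and partly dress"),
    (2, "Monday", "5:30", "Morning hymns"),
    (3, "Monday", "5:30", "Arise and prepare breakfast"),
    (4, "Monday", "5:30", "Arise and prepare breakfast"),
    (2, "Monday", "6:00", "Read Bible"),
]


def tabulate(records):
    """Group activities by (day, slot) and count how many teachers list each."""
    table = defaultdict(Counter)
    for _teacher, day, slot, activity in records:
        table[(day, slot)][activity] += 1
    return table


table = tabulate(records)
for (day, slot), counts in table.items():
    print(day, slot, dict(counts))
```

The published schedules keep only the list of distinct activities in each slot; the counts that a table like this would hold are the numbers the article says it omits for the sake of space.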

The morning activities which have been recorded on the time schedules for the five days are much the same. They vary most in the ones for whom the primary teacher cares before school: the baby, children, invalid mother, father, or husband; and in school: nursery, kindergarten, first, second and third grade. It is the time after school which varies more from teacher to teacher and from day to day. As the Monday schedule shows, Monday evening is largely spent in the home. Tuesday afternoon activities within the school building include faculty meetings and conferences with parents or principal; while for the evening such activities as Sunday School parties, missionary meetings, Business and Professional Women's Club, American Association of University Women, Rebecca Lodge, Association for Childhood Education, Extension class, Kindergarten Club, and Primary Teachers' Club are listed. Wednesday follows the Monday pattern in that it is largely a home night. In the evening there are a few more church activities listed. The activity which characterizes Thursday as apart from Monday is cleaning the cupboards, the kitchen, or the bathroom instead of the laundry or the ironing. The Thursday evening list includes: study for extension, reading, listening to the radio, playing cards or visiting. The morning activities for Friday may lock step with the other four days but recreation characterizes the activities from two o'clock in the afternoon through the evening. In school may be found Recreational Music, Story Hour, Industrial Art, Household Arts, and Free Activities. Among the evening activities may be found dinner out, movies, games with the children, beauty parlor, theater, card party with friends, and television with invited guests.

It will be noted upon studying the Saturday Schedule that many of the activities which the general housewife does during the first six days of the week are crowded into Saturday. In order to do this, the housewife-teacher must be an expert in household management. She must be an efficient worker and a careful planner.

With the stress of the week's school and household work behind them, these housewife-teachers (note Sunday Schedule) on Sunday become housewives resting an hour longer before arising. But once up they carry through in the usual efficient manner, for Sunday finds 90 percent of these married and single teachers attending either Sunday School or church or both. Twenty percent of these work in some capacity as Sunday school teacher, organist, pianist, helper in young people's activities, or choir member. In planning for the whole of the primary teacher, Sunday worship and morning devotions, as well as recreation, should be considered.

These data which were recorded every half-hour from the time the primary teacher got up in the morning on Monday until she retired Sunday evening may be summarized as follows:


I. Personal
1. Personal hygiene: soap, cosmetics, tooth paste, shampoo, home permanents
2. Recreational:
   a. Reading (The group did not list what they read.)
   b. Music: organ, piano, voice
   c. Radio, television
   d. Movies, theater
   e. Games: with children, cards with adults
   f. Correspondence
   g. Clubs: American Association of University Women, American Legion Auxiliary, Association for Childhood Education, Business and Professional Women, Rebecca Lodge, etc.
[illegible] devotion, church, Sunday


December 1953) MOTT


TABLE I


NURSERY-PRIMARY CURRICULUM


Certification Requirements    "Felt Needs"                                    Semester
Listed as Subjects            Listed as Content                               Hours


General Education

Art and Music                                                                 [illegible]
  Music appreciation, piano   Simple melodies, rhythmic activities
  [illegible]                 Knowledge and appreciation of old and new
                              masters, design, sketching, painting,
                              weaving (emphasis on creative art)
  Industrial Art              Knowledge of tools, wood, metal, clay,
                              leather
English                                                                       9-15
  Written                     Letters, articles (magazines, papers), radio
                              scripts, stories, grammar, poetry
  Oral                        Reports, story telling, choral reading, round
                              table discussion, parliamentary rules, usage
  World Literature            Essay, poetry, novel
  Modern Literature           Essay, poetry, novel, knowledge of magazines
                              (professional, literary)
  Children's Literature       Stories, poetry, knowledge of magazines
Health and Physical Education                                                 5-8
  Body Health                 Hygiene of home, personal health, first aid,
                              good grooming, morning inspection
  Physical Education          Gymnastics, calisthenics, dancing
  Adult Games                 Basketball, soft and volley ball, square
                              dancing, tennis, bowling, cards, checkers
  Children's Games            Rhythm games, outdoor games, indoor games
Human Development                                                             18-30
  History                     Human race, national, state, province
  Government                  Group control, school room, national
  Sociology                   Family, neighborhoods, employment, race
                              prejudice
  Psychology                  Mental abilities, laws of learning, retention,
                              mental hygiene, etc.
Science                                                                       12-20
  Agriculture                 Gardens, pet husbandry, room plants
  Botany, Biology             Aquariums, herbariums, plants, flowers,
                              birds, insects
  Chemistry                   Acids, alkalies, cosmetics, toilet articles,
                              soaps, cleaning compounds
  Household                   Management, meal planning, marketing,
                              sewing, laundry, cleaning
  Physics                     Heating and plumbing systems, electric
                              appliances, lighting
  Mathematics                 Budgeting, taxes, insurance
Religion                      Daily devotions, Old Testament, New Testament


TABLE I (Continued)


Certification Requirements    "Felt Needs"                                    Semester
Listed as Subjects            Listed as Content                               Hours


Professional Education

Human Growth and Development                                                  4-8
  Child Development           Family group, parental care, behavior
                              patterns in physical, mental, social and
                              moral development (birth to 3 years)
  Child Psychology            Behavior patterns in physical, mental,
                              social, and moral development (3 to 11
                              years), guidance
Teaching, Learning Techniques                                                 7-18
  Reading                     Experience charts, oral, thought, problems,
                              phonics, vocabulary, printing, writing,
                              book building
  Arithmetic                  Development of concepts, printing, drill,
                              experience centered, practical use
  Social Studies              Agencies, institutions, leadership,
                              cooperation
  Science                     Aquariums, herbariums, excursions
Public Education              History of education, philosophy, school law,   2-3
                              faculty meetings, professional organizations
Student Teaching              Nursery-primary level, participation; guided
                              and full responsibility
Electives




School, Young People's Meetings, Women's Missionary, Choir, Guild, Sunshine Club, etc.


II. Semi-Vocational
1. Marketing: food, clothing, furniture, etc.
2. Household Management
   a. Laundering, washing, ironing
   b. Cleaning: vacuuming, scrubbing, dusting, etc.
   c. Meal planning and cooking: breakfast, lunch, dinner, supper, school lunches and teas
   d. Sewing, mending, embroidering, crocheting, etc.
   e. Caring for heating and plumbing systems


III. Vocational
1. Science: natural, social, physics, psychology (general and child)
2. Art: music, fine and industrial arts
3. English: literature, grammar, usage, journalism
4. Physical Education: games, playground supervision
5. Arithmetic: primary (nursery to third grade), budgeting, taxes
6. Advanced Study Classes: correspondence, Extension, music
7. Professional Organizations: faculty meetings, Primary Teachers Club, Kindergarten Club, Association for Childhood Education




2. What are the Requirements for Certification? 


The first step in this second part was that of 
writing to each of the forty-eight State Superin- 
tendents of Public Instruction for their certifi- 
cation requirements. Where a state does not 
have the primary certification requirements the 
one for elementary teachers was sent and is used 
in this study. 

The certification requirements, as set forth by the states, are tabulated and a curriculum containing those items which the majority include is constructed. The underscored curriculum recorded in Table I is the summary of these findings.


Summary


The Student Branch of the Association for Childhood Education at Southern Illinois University then made the summary of the certification requirements into a frame. Into this frame they have written the "felt needs of the primary teacher." This is found on the right in Table I.

This summary sheet was then checked by 80 summer school students (primary teachers who came for summer school) at Southern Illinois University in the Primary Methods Classes. It is their decision that IF such courses as chemistry, physics and mathematics will be reconstructed to meet the "felt needs of the primary teacher" it is all right to include them, but as they now stand in many colleges and universities they should be omitted and new courses constructed which will be of value to the housewife-teacher.




5:00 
5:30


6:00 


6:30 


7:00 


7:30 


8:00 


8:30 




MONDAY SCHEDULE 


Arise and partly dress 
Fix breakfast and listen to radio 


Turn on radio for news 

Morning hymns 

Arise and prepare breakfast 

Eat and listen to news 

Dress and start breakfast 

Up and prepare breakfast for six 
Up and turn on heat 


Read Bible 

Prepare breakfast 

Pack three lunches 

Up and straighten house 

Up and dress 

Start coffee and read paper 
Eat and straighten room 


Eat breakfast 

Dress and prepare for school 
Feed dog and walk him 

Dress and feed baby 

Take daughter to University 
Prepare children's lunches 
Lay out clothing for children 


Dress for school 

Start for school 

Breakfast 

Drive two miles for nurse 
Wash dishes and make beds 
Iron a blouse 

Straighten house 

Dress and make beds 

Take husband to work 
Leave for school 


Build fires at school 

Clean out furnace at school 
Dust and arrange room 
Prepare materials 

Drive three miles to school 
Finish dressing 
Preparation and inspection 
Take baby to mother 

Get the maid, go to school 


Prepare work for day 

Greet children, prepare work 
Supervise play 

Check workbooks 

Sell candy 

Supervise free play 

Wash Sue's face and comb hair 


8:30 


9:00 


9:30 


10:00 


10:30 


11:00 


11:30 




Wash Johnny's feet and give him sox 
Send sick to nurse 

Inspect and sew button on child's dress 
Inspect and call parents 


Opening exercises 
Inspection and call parents 
Inspection and order milk 
Music 

Reading, workbooks 
Reading and drill 

Reading and phonics 
Excursion 


Arithmetic and drill 
Excursion 

Arithmetic and science 
Science and art 
Demonstration lesson 
Industrial art and numbers 


Reading experience chart 
Music 


Numbers and recess 
Recess and health 

P. E. and Reading 

P. E. and science 

Recess and letter writing 
Recess and printing 
Morning lunch and rest 


Numbers and music 
Art and science 
Lunch and rest 
Recess and science 
Recess and milk 
Excursion 
Numbers and art 
Rest and numbers 
Recess and music 


Numbers and Story 
Rest and numbers 
Science 

Excursion 
Experience Chart 
Industrial art 

Art and numbers 


Numbers and Science 
Industrial art 

Music and numbers 
Numbers and lunch 
Preparation for home 

Clean up room and lunch
Industrial art and clean up 




MONDAY SCHEDULE 
P. M.


12:00 — Lunch 3:30 — Industrial art and spelling 
Prepare and eat lunch Spelling and music 
Home and lunch Preparation of room for next day 
Shop and lunch Cleaning and leaving 
Supervise lunch Conference 
Social studies 
12:30 Lunch Leave to shop 
Supervise play . 
Prepare for P. M. classes 4:00 pn pay bills 
ostoffi 
1:00 Reading Groceries 
Story Play with baby ; 
Inspection and send to nurse Get child from mother's 
Music and story Rest 
Number work and drill Start supper 
Phonics and number Feed father 
Sci 
Resting ang Eo 4:30 Prepare room 
Art and — Clean and fix furnace 
κών. Sweep and prepare for next day 
1:30 industeial art camer 7 and go home 
pery re dramatization Kindergarten teachers’ meeting 
doles ps dancing Faculty conference 
Scie Ὃ dala t Go to postoffice 
ίσα κ Shop and go home 
Printing and writing : repare supper and wash dishes 
Writing and art e PUMA P cime and wash dishes 
H roceries 
2:00 Rest and lunch Genero 
Rest and art 5:30 Eat 
Rest and music á Prepare supper 
Thee play Wash dishes 
Science Straighten house 
Recess and art Groceries 
Excursion 
2: : Supper 
:30 Rest and lunch 6:00 rier supper 
Recess and art Wash dishes 
Recess and music Feed baby and eat 
Rest and art Feed and walk dog 
Science and P. E. 
Drill and music 6:30 Take nurse home 
Prepare for home i Eat and wash dishes 
Clean up and leave Start laundry in basement 
3:00 play with baby 
Science Help children with school work 
Work period 
Industrial art 7:00 Give small son bath 


Clean up room 

Prepare for next day 

Spelling and writing 

Close school 

Conference with parent 
Conference with principal 
Prepare for next day and leave 


Study for extension 
School work 

Sell tickets at School 

Sell candy at school game 
Iron 

Extension 




7:00 


7:30 


8:00 


8:30 


MONDAY SCHEDULE 
P.M. cont’d 

Take baby to see grandmother 8:30 Plan meals for the next day 
Straighten up house Write note for maid 
Lay out clothes for next day Help girls with baths 
American Legion Auxiliary 9:00 Laundry 
Lay out clothes for morning American Legion Auxiliary
Iron Prepare for next day school work 
Laundry Extension 
Prepare school work Read paper and magazines 
Put away laundry Iron and radio
Sell tickets at school Help girls with bath 
Sell candy at school Prepare son for bed 
Extension class Plan meals 
Help boys with lessons Radio and rest 
Put baby to bed 

9:30 Finish laundry 
Laundry School work 
Prepare school work Iron and radio
Extension class Extension class 
American Legion Auxiliary Prepare for bed
Study for extension 
Sprinkle clothes 10:00 Take bath and pin up curls 
Sell tickets at school Prepare for bed 
Sell candy at school Lay out clothes for family 
Iron Prepare dress for next day 
Radio, mend, read Read and write letters 
Lay out baby clothes Launder stockings and to bed 
Help children with bath Watch television 
Laundry 10:30 Pick up around the house 
American Legion Auxiliary Launder stockings, etc.
Prepare work for school Bath and to bed 
Read paper Read the Bible and to bed 
pen candy κ στων Read and write letters 

Extension class To bed

Iron while listening to radio Hes us — 
Radio and help children 11:00 Retire 
Radio and sew 


Bath and to bed 




5:30 


7:00 


7:30 


8:00 


12:00 


12:30 


1:00 




SATURDAY SCHEDULE 


Arise and dress 
Turn on heat 


Start coffee 
Read paper 
Arise 

Eat 

Go for maid 
Launder 


Arise and start breakfast 
Prepare breakfast, radio 
Laundry, start coffee 
Eat, Laundry 


Eat breakfast 

Dress, put on coffee 
Prepare breakfast 
Laundry 

Arise, prepare breakfast 
Feed baby, dress her 


Wash dishes, launder 
Breakfast 

Change linens 

Launder, prepare breakfast


Launder, beds 

See that son gets dressed 
Wash dishes, laundry 
Wash dishes, laundry linens 
Pin up hair 


Rest 

Wash dishes 
Dinner 
Shopping 
Prepare lunch 


Rest 

Play outside with baby 
Wash dishes 

Shopping 

Dinner 

Clean house 


Plan school work 

Put baby to rest 
Shopping, visiting 
Rehearse vocal lessons 
Shopping 

Bath, dress 

Wash windows 


A. M. 


9:30 


10:00 


10:30 


11:00 


11:30 


2:00 


2:30 




Laundry, dishes 

Clean kitchen, defrost refrigerator 
Scrub floor, hang clothes on line 
Shampoo girl's hair, put it up 

Shop 


Shop 

Change linen, laundry 

Clean house, radio 

Shampoo girl's hair, pin it up 


Shop 

Clean bathroom, watch child 
Clean house, start dinner 
Change linens, radio 


Put groceries away 


Read, rest 
Clean house, supervise son's play 


Clean house, start dinner 
Shopping 

Postoffice 

Prepare lunch 


Prepare lunch 

Clean house, supervise son's play 
Clean house, radio 

Dress to go to bank 

Service station for gas 


Eat, shop 
Prepare lunch 


Plan school work 
Clean house 
Shopping 

Visiting 

Baking for Sunday 
Take maid home 
Take girls to movie 
Shampoo hair 


Plan school work 
Clean house 
Shopping 

Visiting 

Baking for Sunday 


Bath, dress 

Put away clothes 
Shopping, cleaning 
Baking for Sunday 
Lunch 




4:00 


4:30


5:00 




SATURDAY SCHEDULE 


P.M. cont'd 


Shopping with friends 
Ironing 

Visiting 

Practice on piano and organ 
Read papers 

Clean porch, Steps, etc. 


Shopping with friends 

Dress baby and give her juice 
Visiting 

Practice on piano and organ 
Ironing 

Pick up girls and take them home 
Cleaning 


Shopping with friends 

Take baby Shopping 
Visiting 

Practice on piano and organ 
Ironing 

Bath, dress 


Shopping with friends 
Shopping with baby 

Visiting 

Practice on piano and organ 
Take dog out 

Plan Sunday dinner 

Bathe and dress son 


Dinner out 
Shopping with baby 
Visiting 

Dinner 

Radio and mend 
Rest, read 
Prepare dinner 


Dinner out 

Dinner, prepare, eat 
Care for father 
Radio and mend 
Company 

Prepare Supper 


Dinner out 
Dinner 
Prepare supper 
Radio, mend 
Wash dishes 
Company 

Play with baby 


7:00 


8:00 


8:30 


9:30 


10:00 


10:30 




Dinner out 

Dinner, wash dishes 
Supper, radio 

Radio, mend 

Read and play with baby 
Company 


Cards and visiting 

Clear table and wash dishes 
Iron dresses 

Study Sunday School lessons 
Take baby to grandmother 


Cards and visiting 

Play with son and baby 
Plan Sunday with husband 
Wash dishes, read 


Cards, visiting 

Put to bed Son, baby, girls 
Sunday School lessons 
Read paper 

Radio, television 


Entertain 

Cards, visiting 

Movie, television 

Check clothes for Church 
Sunday School lessons 


Entertain 

Cards, visiting 
Movie, television 
Bathe and go to bed 
Visit with husband 
Sunday School lesson 


Cards, visiting 
Entertain, television 
Movie, radio 

Visit with husband 
Read, go to bed 


Home reading 
Entertain, movie 
Television, radio 
Bathe and go to bed 
Visit with husband 


Bed 
Entertain, radio 
Movie, television 




6:00 


8:00 


2:00 


4:00 


5:00 


Turn on heat 
Arise and dress 


Dress 

Prepare breakfast 
Sprinkle clothes 
Clean room 


Read paper 
Breakfast 
Feed father 
Wash dishes 
Make beds 




SUNDAY SCHEDULE 
A. M. 


9:00 


9:30 


10:30 


12:00 


Study Sunday School lesson 


Arise 


Read 

Sleep 

Dinner 

Feed father 
Wash dishes 
Radio 

Put baby to bed 


Read 

Sleep 

Iron son's clothes 
Read Sunday paper 
Wash dishes 


Driving 

Visiting 

Reading 

Wash dishes 
Straighten up house 
Television 


Visiting 

Driving 

Radio 

Television 

Study extension 
Read Sunday paper 
Visit grandmother 


Prepare supper 
Radio 
Visiting 


5:00 


6:00 


8:00 


9:00 


10:00 




Dress for church 
Wash dishes 
Dress baby 


Sunday School lesson 
Teach class 
Dress for church 


Church 
Play organ 


Dine out 
Prepare dinner 


Study extension 
Visit grandmother 


Supper 

Radio and play with baby 
Visiting 

Wash dishes 

Feed father 

Television 

Walk dog 


Reading and visiting 
Planning for next week 
Radio, church, canasta 
Eat left overs from dinner 
Wash dishes and television 


Planning for next week 

Radio, church, movie 

Church, bathe baby and put to bed 
Church and Young People's Meeting 
Television 


Planning for next week 
Church 

Radio, visit with husband 
Fix fires for the night 
Put girls to bed 

Movie and bed 


Bed, radio 


A SIMPLE COURSE EVALUATION SCALE 


RALPH MASON DREGER 
Florida State University 
Tallahassee, Florida 


Introduction


ALTHOUGH teachers and administrators have been aware of the need to evaluate courses to determine the effectiveness of methods, teachers, and materials utilized in the curriculum, there still exists a demand for methods of evaluating college-level courses. (4) Among other endeavors to meet this demand one of the most [illegible] is the large scale program of evaluation in the General Education movement under the leadership of Dr. Paul Dressel. The results of this program should be of value to those in and out of the General Education movement. (6) The evaluation scale described in this article was [illegible] constructed on the basis of the [illegible] need for evaluation mentioned above as well as in answer to personal needs of an instructor. Other scales have been developed which are more adequate from some standpoints; [illegible] (1) technique for evaluating teachers, for example, could be generalized beyond the grade school level on which it was standardized. The [illegible] of the scale presented here has been found to reside in its simplicity, its ease of administration, its production of frank judgments on the part of students, and to some extent its statistical manipulability.
Any college or university class which employs some combination of lecture and discussion methods can use the scale. And any teacher can [illegible] the trends represented in his class if he knows how to take a mean and standard deviation. The writer has utilized the technique over a period of several years and in two different school situations. Also, a group of teachers in an area of General Education employed it to make statistical comparisons among their various classes. Other individual teachers have made use of it.

The scale is administered as a separated addendum to the final examination. That is, the sheet containing the scale is handed out immediately after the final examination is distributed with the request that students fill it out, but that they place the sheet in a separate pile and in different order from the examinations. Emphasis is laid on the necessity for, and precautions taken to preserve, anonymity. In the section on "Qualitative Uses" a further statement is made concerning the frankness of expressed feelings obtained by the scale under these circumstances.

Recognition is fully accorded to the need for other forms of course evaluation. The present scale is offered as only one means of evaluation, albeit an important one, the student's frank rating of important factors in the course.


The Scale


Graphic rating scales have been found to be satisfactorily reliable and easy to use. (5, 7)

The "Student's Evaluation of Course" is a simple graphic rating scale with students acting as "judges." Figure 1 presents the latest revised version of the scale. When a supplementary text is employed, a third item between Numbers 2 and 3 has been included. (See page 147.)

Also, in item Number 6 (Number 7 if the supplementary text item is included) the statement was changed for one term to: "The course has made a change in my philosophy of life." From student comments the wording in Figure 1 appears to be preferable.

On the whole, the use of an invitation to make comments, following item Number 6, has proven to be more productive than merely leaving the space blank with no suggestion.

Items Numbers 4, 5, and 6 are purposely set up to indicate a balance if the student feels the course is balanced. Varying of the "most favorable" point is also intended to avoid the tendency to make a mark on each line at about the same place. The influence of the totality of the course and the instructor cannot help being felt from one item to another, though the "halo effect" appears to be less than might be expected in such a scale as this one. (See section on "Quantitative Uses.")

Though the writer is aware that the scale as set up violates several principles considered best for rating scales (5), in particular, using definite marks along the rating line, and not having descriptive phrases between the extremes, the purposes of the scale were held more important than these considerations. It was thought, possibly unjustifiably, that the use of marked points would encourage use of any part of the scale rather than just the white space in the middle. And since the scale is meant for a quick expression of feeling at the close of the final examination when time is of the essence, the fewest descriptive phrases


* The author wishes to express his thanks to the Psychology Seminar of Florida State University, and especially to Dr. Theron Alexander, Jr., Dr. [illegible]aker, and Dr. Anders Sweetland, for helpful suggestions and criticisms.




Student's Evaluation of Course


Name of Course ____________________


Please give your best judgment on the following items. Do NOT sign your name. Check on each scale where you feel the right place of emphasis is.

1. The material presented in the course as a whole is:

Very helpful |-----|-----|-----|-----|-----|-----| Not helpful at all

2. The Text is:

Very helpful |-----|-----|-----|-----|-----|-----| Not helpful at all

3. The lectures are interesting or not:

Stimulating |-----|-----|-----|-----|-----|-----| Very dull

4. The amount of work required for the course was:

Too little |-----|-----|-----|-----|-----|-----| Too much

5. The balance between lecture and discussion has been:

Too much lecture |-----|-----|-----|-----|-----|-----| Too much discussion

6. The course has made a change in my thinking or philosophy of life:

Profound change |-----|-----|-----|-----|-----|-----| No change at all

FEEL FREE to use the space below to make comments on any of the above questions:


Figure 1


Sample of the Evaluation Scale

(Centered on 8½ x 11 mimeographed sheet. Division points are at each eleventh space.)


December 1953) DREGER 147


3. The Manual, Workbook, or Prospectus is:

Very helpful |-----|-----|-----|-----|-----|-----| Not helpful at all


consonant with clarity had to be included.

The seven-point scale has been questioned at times on account of an alleged tendency [illegible] to group toward the midpoint. This tendency does not seem to be great in actual practice with this particular scale. In both the writer's and others' classes students have made their choices at the ends of the scale where they [illegible] feel their choices should be. Means of distributions have differed significantly from the midpoint again and again. In other research the writer has utilized an eight-point scale and has found no manifest differences between seven and eight points.

Scoring has been done in tenths of a point, from 1.0 to 7.0. The score is taken where the point of the check-mark (✓) or the [illegible] of the X (x) is found. Where the student writes a word or a group of words, rather than making a mark, the midpoint of the word or words has been scored; one is not confronted with this exigency very often. A suggestion is made by Guilford (5) that scoring units be smaller than scale units. No doubt, however, tenths of a point are too refined. Yet most students do not mark the scale compulsively at the points printed on it, so some smaller division [illegible] numbers is needed. A marking key has been utilized at times; estimation of tenths, [illegible] fairly accurately determined without a key, is recommended, especially in light of the [illegible] over-refinement of scoring units.
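
The scoring rule in tenths of a point can be sketched as a small calculation. This is a hedged illustration and not the writer's marking key: it assumes seven printed division points spaced eleven typewriter spaces apart (following the note under Figure 1), that the leftmost point is scored 1.0 and the rightmost 7.0, and that a mark's position is measured in spaces from the leftmost point. The function names are hypothetical.

```python
SPACES_PER_UNIT = 11   # spacing of division points, per the note under Figure 1
N_POINTS = 7           # assumed number of printed division points


def score_mark(position):
    """Convert a mark's distance in spaces from the leftmost point to a score.

    Positions off either end of the line are pulled back to the nearest end.
    """
    max_pos = SPACES_PER_UNIT * (N_POINTS - 1)
    position = min(max(position, 0), max_pos)
    return round(1.0 + position / SPACES_PER_UNIT, 1)


def score_word(start, end):
    """Score a written word or phrase at the midpoint of its extent."""
    return score_mark((start + end) / 2)


print(score_mark(0))        # a mark on the leftmost division point
print(score_mark(33))       # a mark on the fourth division point
print(score_word(40, 50))   # a word written between the fourth and fifth points
```

Which end of each line counts as 1.0 is not stated in the article, so the direction used here is only a convention for the sketch.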

As far as the writer is concerned, this evaluation scale may be copied by anyone to be used in his classes. If users care to report on their results such a report will be appreciated in order to add to validational materials.


Quantitative Uses

Direction of Attitude. The first use which the writer found helpful is to determine the direction of attitude of a class. A mean or median, depending on the form of the distribution, can show where the class as a whole is tending. Standard deviations and ranges show how widely the students may differ from one another. A convenient adaptation of a bar graph helps here. Figure 2 gives an example of this kind of graph.
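As a rough illustration of the summary statistics named above, the following sketch computes a mean, median, standard deviation, and range for one scale item. The scores are hypothetical (they are not taken from the article), and the function name `summarize_item` is an assumption introduced only for this example.

```python
from statistics import mean, median, stdev


def summarize_item(scores):
    """Return the mean, median, sample standard deviation, and range of one item."""
    return {
        "mean": mean(scores),
        "median": median(scores),
        "sd": stdev(scores),  # sample standard deviation (n - 1 in the denominator)
        "range": (min(scores), max(scores)),
    }


# Hypothetical scores for one item, read in tenths of a point from 1.0 to 7.0.
item_1_scores = [1.5, 2.0, 2.4, 3.1, 2.0, 4.6, 1.0, 2.8]
summary = summarize_item(item_1_scores)
print(summary)
```

When the distribution is skewed, the median rather than the mean would be the figure to plot as the vertical line of the bar graph.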


m ff 
tero discriminative Judgnents on 


The it 


It is easy to utilize one of the mimeographed evaluation sheets to exhibit mean, standard deviation, and range.

A suspicion may obtain in connection with items Number 1 and 3 that the "halo effect" could very easily operate. The course and the instructor may be read in terms of each other. Nevertheless, Pearson product-moment correlation coefficients between items Number 1 and 3 from two groups representing two different schools were .508 and .270 respectively, with the latter not significant. These correlations were in line with expectancy, since the first group consisted mostly of freshmen who would be expected to be less discriminating than the second group consisting of relatively mature students. Only twenty-five percent of the variance in item Number 1 is accounted for by variance in item Number 3 in the first group of students.
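The "twenty-five percent of the variance" figure follows from squaring the correlation coefficient. A minimal sketch of that arithmetic, using the two coefficients reported above (the helper name `shared_variance` is an assumption made for this example):

```python
def shared_variance(r):
    """Proportion of variance in one variable accounted for by the other (r squared)."""
    return r ** 2


for label, r in [("first group", 0.508), ("second group", 0.270)]:
    print(f"{label}: r = {r:.3f}, r squared = {shared_variance(r):.3f}")
```

For the first group the result is roughly one quarter; for the second group it is well under one tenth.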

A bigger question may be asked, perhaps the most relevant one, "Does the scale actually show a difference of attitude from one situation to another?" That is, is direction of attitude actually differentiated, or do all classes in all situations mark the scale similarly? The answer is that it has shown significant differences among courses and between objectively varying situations.

The group of teachers mentioned in the Introduction found statistically significant differences among their courses.* Tables I and II present an analysis of variance of both courses in the General Education area involved. Table I gives the analysis of both courses together on all items, Table II of each course separately on item Number 1, the "overall item."

Even though, however, the scale does manifest 
differences where one might expect them to be 
found, it is consistent in respect to courses taught 
by the same teacher. In the group comparisons 
above, classes taught by the same teacher did not 
differ significantly from one another in spite of 
fairly large differences among classes taught by 
different instructors. In another case, in the fall 
of one year one teacher taught a course in Person- 
al Development to freshmen mainly, a course in 
General Psychology to regular students in a uni- 
versity, sophomores and above, and another Gen- 
eral Psychology course to an Air Force night 
class at an air base. As Table III indicates, no 


han on any other item. (Cf. Table I) 
or are forced by the scale to make 




Student's Evaluation of Course

Name of Course

Please give your best judgment on the following items. Do NOT sign your name. Check on each scale where you feel the right place of emphasis is.

1. The material presented in the course as a whole is:
   Very helpful          Not helpful at all

2. The Text is:
   Very helpful          Not helpful at all

3. The lectures are interesting or not:
   Stimulating          Very dull

4. The amount of work required for the course was:
   Too little          Too much

5. The balance between lecture and discussion has been:
   Too much lecture          Too much discussion
The course has made a change in my thinki ife: 


Profound change : No change at all 
Li 


FEEL FREE to use the space below to make comments on any of the above questions:


Figure 2

Bar Graph Depicting Mean (vertical line), Standard Deviation (heavy bar), and Range (light bar) on Each Scale Item




TABLE I

ANALYSIS OF VARIANCE OF SCORES IN TWO COURSES ON ALL ITEMS OF SCALE

Item  Source of Variation  Sum of Squares  Degrees of Freedom  Mean Square  F Ratio*
1.    Between groups             68.63             8               8.58        6.86
      Within groups             571.20           458               1.25
2.    Between groups            110.89             8              13.86       10.66
      Within groups             583.91           450               1.30
3.    Between groups            110.57             8              13.82       19.68
      Within groups             481.37           443               1.09
4.    Between groups             40.79             8               5.10        5.73
      Within groups             397.14           448                .89
5.    Between groups             48.41             8               6.05        9.76
      Within groups             274.37           440                .62
6.    Between groups            223.14             8              21.89       13.68
      Within groups        [illegible]           455               2.20

* All ratios in this column are significant at the 1% level.
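Each F ratio in an analysis of variance table of this kind is the between-groups mean square divided by the within-groups mean square, and each mean square is a sum of squares divided by its degrees of freedom. The sketch below reworks the item Number 1 row under that standard definition; the function name `f_ratio` is an assumption made for this example.

```python
def f_ratio(ss_between, df_between, ss_within, df_within):
    """Return (mean square between, mean square within, F) for a one-way analysis of variance."""
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between, ms_within, ms_between / ms_within


# Item Number 1: sums of squares and degrees of freedom as printed in the table.
ms_b, ms_w, f = f_ratio(68.63, 8, 571.20, 458)
print(round(ms_b, 2), round(ms_w, 2), round(f, 2))
```

Dividing the unrounded mean squares gives an F of about 6.88; the printed 6.86 is what results from dividing the rounded mean squares (8.58 by 1.25).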




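TABLE II

ANALYSIS OF VARIANCE OF SCORES IN TWO COURSES ON ITEM NUMBER 1, COURSES SEPARATED

Course  Source of Variation  Sum of Squares  Degrees of Freedom  Mean Square  F Ratio
A.      Between groups        [illegible]       [illegible]     [illegible]  [illegible]
        Within groups         [illegible]           284             1.37
B.      Between groups            25.16               2            12.58       11.98
        Within groups            182.58             174             1.05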
TABLE III

ANALYSIS OF VARIANCE OF SCORES ON ITEM NUMBER 1 IN THREE CLASSES TAUGHT BY SAME TEACHER

Source of Variation  Sum of Squares  Degrees of Freedom  Mean Square  F Ratio*
Between groups             6.99              2               3.50       2.99
Within groups            164.88         [illegible]     [illegible]

* Ratio is not significant.


significant difference showed up on item Number 1 among these exceedingly diverse groups.


Chronological Trends

Although at any one time the scale is consistent for any one teacher, during the several years that the writer has employed the "Student's Evaluation of Course" trends have been manifested on the scale which are validated by the external circumstances. It was predicted that during the terms in which he was writing his dissertation the writer's scales would show a trend away from the normal, in particular, higher means on item Number 1. Figure 3 makes some case for the way in which the prediction was borne out. Comparable classes across a period of a number of quarters and semesters gave mean scores which were represented in toto by a linear regression line with a slope of -.04. Without the mean score for the term devoted least to class preparation the line has a slope of -.06. The first mean represents the first term the writer taught on a college level. The other means, apart from those of the dissertation period, show a consistent trend. On the other hand, if one takes only the points between the first mean and the highest (which deviates from the others because of external circumstances, the author believes), the parabola which almost perfectly fits these points does seem to represent an actual trend of scores. The external situation is a verifiable one; the scores appear to follow the objective conditions. If the extrapolated means actually represent the trend, one can be thankful no further terms had to be devoted to dissertation writing.

Whatever function fits the points, the 2.86 at least appears to be out of place. A standard error of estimate calculated including 2.86 is .241; without 2.86 it is .158. In the latter case, 2.86 lies [digit illegible].675 standard errors of estimate above the predicted score, considerably beyond the one percent level of chance expectancy.
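The procedure described above (fit a line to the term means, compute a standard error of estimate, then ask how far a suspect mean lies from the line fitted without it) can be sketched as follows. The term means in the sketch are hypothetical, since the article's individual means are not listed in the text, and the standard error of estimate is assumed here to be the square root of the residual sum of squares divided by n - 2; the article does not state which formula was used.

```python
from math import sqrt


def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def standard_error_of_estimate(xs, ys, slope, intercept):
    """Square root of the residual sum of squares over n - 2 (assumed formula)."""
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return sqrt(sse / (len(xs) - 2))


# Hypothetical mean scores on one item for eight successive terms.
terms = [1, 2, 3, 4, 5, 6, 7, 8]
means = [2.40, 2.15, 2.20, 2.05, 2.90, 2.10, 1.95, 2.00]
suspect = 4  # index of the mean that looks out of place

# Refit without the suspect term and express its deviation in standard errors.
kept_x = [x for i, x in enumerate(terms) if i != suspect]
kept_y = [y for i, y in enumerate(means) if i != suspect]
slope, intercept = fit_line(kept_x, kept_y)
see = standard_error_of_estimate(kept_x, kept_y, slope, intercept)
deviation = (means[suspect] - (intercept + slope * terms[suspect])) / see
print(round(slope, 3), round(see, 3), round(deviation, 2))
```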


Group Comparisons

It is usually difficult to get teachers to have statistical comparisons made on their classes; nevertheless, the scale is useful for such purposes. Tables I and II offer examples of this use. It does seem that for the scale to be of full benefit here the teaching staff has to sit down with the results and face them honestly. Such a procedure was followed by the group of teachers mentioned previously. The results were presented as objectively, and diplomatically, as possible. Discussion included an explanation of the limitations of such scales as well as their values. Possibly most groups could not be as objective as this one was. If not all members of a staff agree on common discussion, an administrator with the consent and cooperation of his teachers could use the scale and keep its results confidential. Odious comparisons, first, and complete dependence, secondly, on any one form of evaluation, are, of course, to be avoided.


Qualitative Uses 


It is of no help to the students involved, for they of necessity remain unknown; but on item Number 6 it is revealing to find there are a few in each class who check the very extreme at either end. From a clinical standpoint, these extreme scores usually mark some kind of instability. It may be that honestly a person has had such a profound change that he would mark all the way to the left; yet it is not ordinarily expected that anyone will so mark. If, however, a student marks all the way to the right, one can surmise that he is fighting against the teacher, the course, or both. Any honest appraisal of any course, it would seem, should include recognition of some slight change in one's philosophy of life. It is wholesome for a teacher to realize that there may be someone who has been influenced strongly enough to be very much moved for or against his course.

Validation of the honesty of the ratings, which to an extent is a validation of the scale itself, since the student is supposed to mark what he feels about the course, not what the objective value of the course actually is, is found in the written comments at the bottom of the page below the scale itself. Following are some examples taken from various courses:

"Regarding no. 5 particular teacher required 2, 2000 word research papers for a 2 hour course. All others required only 1." (This person meant Number 4.)

"This course was very stimulating."

"The time consumed in connection with the term paper is not worth the value received from the paper."

"The text is mediocre, gives many examples in place of explanations. Read too much to gain too little."

"Lectures were too deep for class."

"Enjoyed this course more than I do my other ones."

"I don't think that this course should be required."

"Good course -- and good instructor!"

"The workbook was unnecessary. It only made me get mad at some of the questions I couldn't answer and took up too much time when the text was taking up enough."

"I feel that it was too general or hasty and it was a complete waste of time. I do not think I got much of anything from it."


Figure 3

Regression Lines of Means for Eight School Terms

(Horizontal axis: school term. Lines shown: all values included; 2.86 omitted; parabola, last two values omitted.)


"I enjoyed this course very much. Dr. ___ is very good, a very intelligent person -- very pleasing personality."

Evidently these students felt that their identity could not be established. There are enough of these comments in every set of scales to impress one with the student's belief that he is anonymous and can express himself honestly.

One value which cannot be directly measured, but may be inferred, is that students who have not had a chance to "gripe" are given a chance without fear of retaliation. On the other hand, those students who desire sincerely to express their appreciation for a course can likewise have their say without fear of being misunderstood.


Summary 
A course evaluation scale has been described 


which is simple to use and to assess. Its quan- 
titative uses are to show the direction of attitude 
of classes, chronological trends, and possibly 
group comparisons. Extremes of attitude can 
be found. The frankness of the use of the scale 
by students has been suggested. 


BIBLIOGRAPHY 


1. Amatora, S. M. "A Diagnostic Teacher-Rating Scale," Journal of Psychology, XXX (1950), pp. 395-399.

2. Dreger, Ralph Mason. "Self-Ratings of Individuals Studied Projectively." (To be published.)

3. Ezekiel, Mordecai. Methods of Correlation Analysis, Second Edition (New York: John Wiley and Sons, Inc., 1941), pp. 531.

4. Gilmer, B. Von Haller. "Evaluating the Criteria for Higher Education," Journal of Higher Education, XX (1949), pp. 473-479.

5. Guilford, J. P. Psychometric Methods (New York: McGraw-Hill Book Co., Inc., 1936), pp. 566.

6. Mayhew, Lewis B., et al. Cooperative Study of Evaluation of the American Council on Education Attitudes Handbook (East Lansing, Mich.: American Council on Education, 1951).

7. Stagner, Ross. Psychology of Personality (New York: McGraw-Hill Book Co., Inc., 1948).


"RESISTANCE TO EXTINCTION" OF TWO PATTERNS OF VERBAL REINFORCEMENT


E. VICTOR MECH* 
Indiana University 
Bloomington, Indiana 


THERE ARE apparent two broad classes of behavior in the area of conditioning. Those responses conditioned or unconditioned that are elicited by specifiable stimuli are called respondent (Pavlovian conditioning). The other class, comprising all those responses that are emitted more or less independently of identifiable stimuli, may be referred to as operant. Consequently, most of our behavior in the routine affairs of everyday life is operant, in that it operates or acts upon the environment to produce the satisfaction of basic "needs." Respondent behavior is much less commonly observed; it seldom, if ever, operates upon the environment to produce anything.

Skinner (1938) worked exclusively with the rat, reinforcement depending upon the rat pressing the bar in the Skinner-box apparatus. Bar-pressing in this situation is an operant. It occurs with a low frequency prior to any experimental procedure that may be applied. Its strength is increased when it is followed by reinforcement. Increased strength means merely that it occurs with higher frequency than it did before. Probably the best measure of operant strength is frequency of occurrence. An operant is strong when emitted often within a given period of time; it is weak when emitted rarely.

A paradigm suitable for Type R conditioning is shown, along with the familiar Type S schema. (Later the two types will be compared. For the purpose of this paper let us assume a two-factor theory, as do Skinner and Mowrer. Hull (1943) denies the theory that there exist two types of learning. Fundamentally, Hull attempts to handle both under his set of postulates.)


Type R (Skinner): [paradigm diagram; the stimulus that initially evokes the response is shown as a small s]

Type S (Pavlov):
S (food) -----> R (salivation)
S (tone) -----> r (ear twitching)

Type S conditioning involves the elicitation of a response (salivation) by an identifiable conditioned stimulus (tone) that is under the experimenter's control. In Type R conditioning, the specific stimulus that initially evokes the response (bar pressing) cannot be identified. This is indicated by the small s of the Type R paradigm.

For all practical purposes, the response is initially emitted, without recourse to any specific stimulus source.

Type S conditioning involves stimulus substitution and the formation of a new reflex. The tone, in the classical Pavlovian example, comes to act as a "substitute" for food in eliciting salivation; and tone-salivation is the new reflex. In Type R conditioning, however, there is merely the strengthening of a reflex that already exists in the organism's repertory. Bar-pressing, for example, occurs in a Skinner-box with some frequency prior to any reinforcement with food. At any rate, no substitution is involved and no new stimulus-response relation is formed.

This brings us to a focal point in educational research. At least one educational psychologist (Stroud, 1951) has recently made more than passing reference to the need for a systematic behavior theory in education, as follows:

    Undoubtedly, one of the pressing needs in our field is a comprehensive learning theory (or behavior theory) that will encompass teaching and learning in school and embrace the broad phenomena of personality and adjustment. .... A frank attempt to


extend the cur 
Tolman, ΟΥ othe 
is overdue... (p. 282) 


On the other hand, Brownell (1948) points out the futility of attempting to base school practice on any contemporary learning theory. The school situation, according to Brownell, involves interaction of many factors, and so does not follow the simple model of the laboratory experiment. Brownell also stresses that educational psychology is not an attempt to apply psychological learning theory to education; rather, research must develop educational principles out of studies of problems in their natural complex setting. It is evident, however, that the theoretical-experimental psychologist who utilizes infrahuman organisms and performs operations on them, does not, indeed, observe them in their natural complex setting. Actually, many investigators favor the results of relatively simple experiments upon infrahuman organisms as the principal inductive base for learning theory. This preference is not due to any oversimplified belief that all of the principles which may eventually be needed to account for complex behaviors can be derived from these experiments, but rather to a conviction that only from these experiments can there be confidence that the effects of different variables have been isolated. Thus it is difficult to believe that the beginnings of an adequate theory can take root from anything but an exhaustive analysis of experimentally isolated situations. With respect to educational theory then, it appears to be incumbent upon the educational psychologist to utilize testable principles derived from infrahuman studies, and to perform operations upon human subjects in the relatively "simple" laboratory situations.

*The author wishes to express his gratitude to both Dr. Wendell W. Wright, Dean, School of Education, and Dr. Nicholas Fattu, Director, Institute of Educational Research, Indiana University, for their continuing support of basic experimentation.


Although the stated educational aims are directed toward explaining how human organisms learn, there has been in vogue an emphasis upon educational tests. There are thousands of test titles, for which the testers have limited their collection of facts to the marks put on the papers by persons being tested, and to the association of these marks with some criterion which usually has success in school as its only referent. Although the testing movement is highly useful and practical, it has, unfortunately, not contributed to psycho-educational theory. Indeed, it has not advanced our knowledge in explaining how the human learns.

Clearly, then, a behavior theory for education should take behavior as its datum. The primary purpose of the paper is to suggest a direction for the experimental educators to proceed toward this goal.


There has been much work done with the white rat employing the concepts of "reinforcement" and "extinction"; unfortunately, there is a lack of systematic data on the human level.

Jenkins and Stanley (1950) point out that the regular reinforcement of response is, indeed, not the world's rule, and this leads us into an extremely interesting area. There are various schedules of reinforcement that can be administered in the laboratory; for the purpose of this experiment, however, only two were utilized: continuous reinforcement and partial reinforcement. That is, every time in acquisition an S in the continuous reinforcement group gave a response belonging to the "correct" class, a "reinforcement" was received. In the partial reinforcement group, the Ss received reinforcements only 50 percent of the time, or for every other "correct" response. The reinforcement schedules apply only to the acquisition phase of the experiment. Once the acquisition criterion was reached each S was placed on 5-minute extinction (absence of reinforcement).

In measuring the resistance to extinction, the ratio of the number of correct responses to the total number of responses for 5 minutes was used. The number of responses in a complete extinction is generally more satisfactory, since any short-time criterion excludes possibly important data contained in the rest of the extinction curve. On the other hand, short extinction criteria have the advantage of being time-saving and experimentally convenient, and are therefore frequently employed.
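The two acquisition schedules and the extinction measure described above can be written out as a short sketch. The function names, the sample numbers, and the choice to reinforce the first, third, fifth (and so on) correct responses under the partial schedule are assumptions made for illustration; the text specifies only that every other correct response was reinforced and that a correct response is a number containing an eight.

```python
def is_reinforced(schedule, nth_correct):
    """Decide whether the nth correct response (counting from 1) is followed by "right"."""
    if schedule == "continuous":
        return True  # every correct response is reinforced
    if schedule == "partial":
        return nth_correct % 2 == 1  # every other correct response (assumed: odd ones)
    raise ValueError(f"unknown schedule: {schedule}")


def contains_eight(number):
    """A response belongs to the correct class if it has an 8 among its digits."""
    return "8" in str(number)


def resistance_ratio(responses):
    """Ratio of correct responses to total responses during the extinction period."""
    if not responses:
        raise ValueError("no responses recorded")
    correct = sum(1 for n in responses if contains_eight(n))
    return correct / len(responses)


# Hypothetical run of correct responses during acquisition under the partial schedule.
acquisition = [38, 48, 58, 68]
feedback = [is_reinforced("partial", i) for i, _ in enumerate(acquisition, start=1)]

# Hypothetical numbers spoken by one S during the 5-minute extinction period.
extinction = [18, 28, 7, 81, 40, 8, 13, 58]
ratio = resistance_ratio(extinction)
print(feedback, ratio)
```

A ratio is used instead of a raw count so that Ss who speak at different rates can be compared on the same footing.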


Procedure 


Upon entering the experimental room each S was told to read aloud the following printed instructions:

    The task in the experiment is a simple one. All you are to do is to continue saying out loud any numbers that you wish. It is important that you continue this operation until the experimenter tells you to stop. You will be told when the numbers you are saying are on the "right" track. Attempt to say the numbers at a pace that is normal for you.

Any questions that the Ss had were cleared up before allowing them to continue. Actually, then, the Ss were emitting verbal responses. It was decided beforehand to condition Ss to the response class "8". (Any number containing an eight in it.) The verbal reinforcer used was a "right" from the experimenter. The criterion used for conditioning was 20 serially "correct" responses. When the criterion was reached, each S was placed on a 5-minute extinction schedule. The basic data consisted of the number of correct responses made in extinction as compared with the total number of responses made.


In testing for differences 
the question t ΚΛ 


Ὁ be answer 
tinuous reinforce: 


Subjects 


m a 


TABLE I

RATIO OF THE CORRECT RESPONSES TO THE TOTAL RESPONSES FOR EACH S IN THE PARTIAL REINFORCEMENT GROUP DURING A 5-MINUTE EXTINCTION PERIOD

              Extinction in Minutes
S       1       2       3       4       5       Total
S1      5/5     ...     2/2     3/3     3/3     18/18
S2      6/11    ...     17/28   14/16   20/26   66/97
S3      1/9     ...     2/4     1/6     9/12    25/49
S4      8/14    ...     4/6     6/12    6/7     27/48
S5      1/1     ...     3/3     3/3     1/1     15/15
S6      7/19    ...     1/2     6/14    8/10    26/56
S7      19/20   ...     3/21    4/18    3/20    45/98
S8      5/5     ...     11/13   5/12    9/10    30/50
S9      9/12    ...     1/9     6/15    7/18    25/62
S10     ...     ...     14/20   19/23   6/15    57/105


TABLE II

RATIO OF THE CORRECT RESPONSES TO THE TOTAL RESPONSES FOR EACH S IN THE CONTINUOUS REINFORCEMENT GROUP DURING A 5-MINUTE EXTINCTION PERIOD

              Extinction in Minutes
S         1       2       3       4       5       Total
S11       16/17   16/20   9/14    8/14    8/22    57/87
S12       16/22   10/23   18/22   8/19    12/23   64/109
S13       17/19   11/19   9/12    1/9     10/16   48/75
S14       24/29   16/19   7/12    7/11    3/24    57/95
S15       17/19   12/26   13/20   7/16    2/21    51/102
S16       10/16   17/19   13/21   2/20    6/21    48/97
S17       21/29   6/16    14/18   3/20    2/9     46/92
S18       19/24   13/19   12/17   4/18    9/28    57/106
S19       26/31   11/20   13/17   6/21    2/16    58/105
S20       29/36   16/29   14/28   2/15    4/19    65/127
Correct   195     128     122     48      58      551
Total     242     210     181     163     199     995
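The minute-by-minute percent of correct responses for the continuous reinforcement group can be recovered from the "Correct" and "Total" rows of the table above. A minimal sketch of that arithmetic follows; the variable and function names are assumptions made for this example.

```python
# "Correct" and "Total" rows of Table II, minutes 1 through 5.
correct_per_minute = [195, 128, 122, 48, 58]
total_per_minute = [242, 210, 181, 163, 199]


def percent_correct(correct, total):
    """Percent of responses in the correct class for each minute, to one decimal."""
    return [round(100 * c / t, 1) for c, t in zip(correct, total)]


percents = percent_correct(correct_per_minute, total_per_minute)
overall = round(100 * sum(correct_per_minute) / sum(total_per_minute), 1)
print(percents, overall)
```

The per-minute percentages are the quantities a plot of percent correct against minutes of extinction would display for this group.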

TABLE III

FREQUENCY OF FAILURES AND SUCCESSES FOR THE TWO GROUPS DURING FIVE-MINUTE EXTINCTION PERIOD

                           Failure   Success   Total
Partial Reinforcement        264       334       598
Continuous Reinforcement     444       551       995
Total                        708       885      1593


The a, 
age dian pr" was from 19 to 23 with a mean 


Results and Discussion

Table I presents the ratio of correct responses to the total number of responses for the Partial Reinforcement group during a 5-minute extinction period.

Table II, in like manner, presents the extinction data for the group which received continuous reinforcement.

Figures 1 and 2 compare the plot of the mean correct responses in extinction with the mean total responses in extinction for the partial reinforcement (Rp) and continuous reinforcement (Rc) groups, respectively. Figure 3 shows the plot of the percent of correct responses in extinction for the two reinforcement schedules.

Clearly the design encountered in this investigation is a comparison between two frequencies, or more precisely two proportions. A group of twenty Ss was divided at random into two groups of ten Ss. One group was given partial reinforcement in the acquisition stage and the other continuous reinforcement on each correct trial. After conditioning both groups to a comparable criterion (twenty serially correct responses containing a number "8") each S was placed on 5-minute experimental extinction to determine whether the conditioned response under investigation occurred proportionately more frequently in one group or the other. The appearance of the conditioned response during the extinction period can be designated as a success. The appearance of an inappropriate response in extinction may be called a failure. (Any number or set of numbers that does not include "8".) The hypothesis to be tested then is that the two samples are random samples from a common population, and a Chi square test was used to evaluate the outcomes of the experiment. Table III shows the frequency of successes and failures for the two groups during the 5-minute extinction period.

The obtained Chi square value fell barely short of the required 3.84 to be "significant." There are indications, however, that, if the extinction criterion of five minutes were extended, the "correct" response differential between groups would have approached a higher significance level. (See Figure 3.)

Analysis of the data indicates that:

1. Partial reinforcement increases the resistance of a response to extinction and is superior to continuous reinforcement in this respect.

2. The extinction curve following continuous reinforcement starts with an initial spurt in percent of correct responses. Afterwards, the extinction curve is marked by depressions and accelerations in responding. (See Figure 3.)

3. Not only is extinction after partial reinforcement more resistant, but the curve is also smoother than that after continuous reinforcement. The vacillations which come after continuous reinforcement do not occur so markedly after partial reinforcement.
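A Chi square comparison of two proportions, of the kind described in this section, can be sketched for a general two-by-two table of failures and successes. The counts below are hypothetical and are not the study's data; the shortcut formula is the standard Pearson Chi square for a 2 x 2 table without a continuity correction, which is an assumption, since the article does not say which form was computed. The value 3.84 is the 5 percent critical value for one degree of freedom.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson Chi square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


CRITICAL_5_PERCENT_DF1 = 3.84  # 5 percent level, one degree of freedom

# Hypothetical (failure, success) counts for two groups; not the study's data.
group_1 = (30, 70)
group_2 = (45, 55)
chi2 = chi_square_2x2(*group_1, *group_2)
print(round(chi2, 2), chi2 > CRITICAL_5_PERCENT_DF1)
```

Because each S contributes many responses to such a table, the responses are not fully independent observations, which is worth keeping in mind when reading any Chi square computed this way.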


Although the investigation described was relatively simple, the analogue to many complex human behaviors is implied. For instance, the greater smoothness of extinction for the partial reinforcement group might indicate that this "schedule" of conditioning may perhaps increase a person's "frustration tolerance." Intrinsic to partial reinforcement is the recurrence of periods of non-reinforcement during which "frustration" is alternately experienced and overcome by continued responding. Occasional reinforcement gives stability to behavior, and "persistence" in the face of failure.

Let us take a concrete example from the experiment just described. Eventually each S in both groups in the process of emitting numbers responded with a number containing an "8". The continuous reinforcement group (Rc) was told "right" every time they had responded with a number of the correct class (containing an "8"). The acquisition here was rapid, and as reported in a previous section, the criterion for acquisition was twenty serially "correct" responses. (Those containing an eight.) Remember that this group received a reinforcement ("right") each time they emitted the correct response. The partial reinforcement group was told "right" only 50 percent of the time they emitted the correct response. For example, the procedure was as follows: Each S would eventually give a number containing the required "8", a typical report being: S says "38", E says "right"; S says "48", E is silent; S says "58", E says "right", and so on. In this manner the Ss in the Rp group do not meet with success on every correct response, and build up a persistence in the face of failure.

The obtained results clearly suggest a fruitful direction for a behavior theory of education. With the facts of reinforcement "schedules" we now have (Humphreys, 1943; Jenkins and Clayton, 1949; Mowrer and Jones, 1945) some intelligent predictions concerning educational procedures with respect to the control of limited behavior can be made. Extrapolating somewhat, it should


Figure 1

Extinction Curves Comparing the Mean Correct Responses to the Total Mean Responses for the Partial Reinforcement Group

(Horizontal axis: extinction in minutes. Vertical axis: mean responses. Curves: total responses; correct responses.)

Figure 2

Extinction Curves Comparing the Mean Correct Responses to the Total Mean Responses for the Continuous Reinforcement Group

(Horizontal axis: extinction in minutes. Vertical axis: mean responses. Curves: total responses; correct responses.)

Figure 3

Plot of the Percent Correct Responses for Each Minute in Extinction for the Two Reinforcement Patterns

(Horizontal axis: extinction in minutes. Vertical axis: percent verbal responses correct.)


be seen how one would go about teaching a pupil to be persistent in the face of failure. In training a child for confidence at work, or for willingness to persist in social activities, the teacher should see that each child is guaranteed some degree of success and approval regularly at first but later only occasionally, so that he will not give up in the face of setbacks. Outside the laboratory continuous reinforcement is by no means the rule, but neither is strictly partial reinforcement. There are more complex "schedules" of reinforcement (Skinner, 1938) such as fixed-interval and fixed-ratio. It is hardly to be expected though, that any schedule adhering to a fixed number of responses would be honored by our complex social environment. Also, it is not the purpose of the present paper to discuss the relative efficacy of aperiodic, periodic, or regular reinforcements in producing behaviors that are resistant to extinction, as much of the experimentation has been done with infrahuman organisms. There is, however, no reason why experimentally oriented educators should not begin a systematic investigation of these operant phenomena on the human level.


REFERENCES 


Brownell, W. A. "Learning Theory and Educational Practice," Journal of Educational Research, XLI (1948), pp. 481-498.

Hull, C. L. Principles of Behavior (New York: Appleton-Century-Crofts, 1943).

Jenkins, W. O. and Clayton, F. L. "Rate of Responding and Amount of Reinforcement," Journal of Comparative and Physiological Psychology, XLII (1949), pp. 174-181.

Jenkins, W. O. and Stanley, J. C. "Partial Reinforcement: A Review and Critique," Psychological Bulletin, XLVII (1950), pp. 193-234.

Mowrer, O. H. and Jones, H. "Extinction and Behavior Variability as Functions of Effortfulness of Task," Journal of Experimental Psychology, XXXII (1943), pp. 369-386.

Skinner, B. F. The Behavior of Organisms: An Experimental Analysis (New York: Appleton-Century-Crofts, 1938).

Stroud, J. B. "Educational Psychology," Annual Review of Psychology, II, 1951. C. P. Stone, Editor.

163a 


ADDENDA 


Thi i > 
e following are corrections to Victor Mech’s article which is published in this December, 1953 
? Σ 


JOURN. 
ως Sent EXPERIMENTAL EDUCATION, entitled **Resistance to Extinction of Two Patterns of 
rcement. The insertions are underlined, and quotation marks should have been used 


as indicated. 
x KK KX 


The lines including “those responses conditioned or unconditioned....are called respondent” should
be in quotes and preceded by, Keller and Schoenfeld (1950, p. 49) indicate that:

The lines including “the other class comprising....of identifiable stimuli” should be in quotes and
preceded by, Keller and Schoenfeld (1950, p. 49) point out further:

The lines including “most of our behavior in the routine affairs of everyday life....operates upon the
environment to produce anything” should be in quotes and preceded by the following sentence, In
discussing the relative frequency with which both types occur in life situations, Keller and
Schoenfeld (1950, p. 49) state:

The lines including “it occurs with a low frequency prior....increased strength means merely that it
occurs with higher frequency than it did before” should be in quotes and preceded by, With special
reference to this concept Keller and Schoenfeld (1950, p. 50) state:
The sentence, It is pointed out by Keller and Schoenfeld (1950, p. 50) that, should precede the quote
“probably the best measure of operant strength is frequency of occurrence....it is weak when
emitted rarely.”

The paragraph beginning, “A paradigm suitable for Type R....” should have been, “A paradigm suitable
for Type R conditioning is shown by Keller and Schoenfeld (1950, pp. 47-48) along with the familiar
Type S....the small s of the Type R paradigm.”

The paragraph beginning “Type S conditioning....” and ending with [...] should have been in quotes
and preceded by Keller and Schoenfeld (1950, p. 48) describe the differences between Type S and
Type R as follows:

The paragraph beginning, “For all practical purposes....” should be preceded by, With respect to the
response, Keller and Schoenfeld (1950) point out that

The sentence “Type S conditioning involves stimulus....of a new reflex” should be in quotes and
preceded by Keller and Schoenfeld (1950, p. 48) state:

The sentence, “The tone, in the classical....is the new reflex” should be preceded by, Keller and
Schoenfeld (1950, p. 48) indicate further that:
The lines including “In Type R conditioning, however,....stimulus-response relation is formed” should
be in quotes and preceded by, The distinguishing property of operant conditioning is pointed out by
Keller and Schoenfeld (1950, p. 48) as follows:

The sentences “This preference is not due to any oversimplified belief....exhaustive analysis of
experimentally isolated situations” should be deleted and corrected to read, “A primary reason for
the foregoing preference is that the probability of control over the variables being manipulated is
increased, thereby permitting a more valid interpretation of the obtained results.”

The sentences “The number of responses in a complete extinction....and are therefore frequently
employed.” should be in quotes and preceded by, The ideal conditions for extinction are discussed by
Keller and Schoenfeld (1950, p. 72) who state:


The first and second conclusions under Analysis of the data, should be changed to read:

1. Fifty percent verbal reinforcement increases the resistance of a response to extinction when
compared with 100 percent verbal reinforcement.

2. The performance curve in extinction following 50 percent verbal reinforcement is more
stable than the extinction curve following 100 percent verbal reinforcement.


The sentence “Intrinsic to partial reinforcement is the recurrence....and overcome by continued
responding” should be in quotes and preceded by Keller and Schoenfeld (1950, p. 91) point out that:

The sentence “Occasional reinforcement....in the face of failure” should be in quotes and preceded by,
An application to the training of children is made by Keller and Schoenfeld (1950, p. 91) who state:

The sentence “In training for skill, for confidence at work....so that he will not give up in the face of
setbacks” should be deleted and the f[...] regard to building persistence in children, Keller and
Schoenfeld (1950) state, [...] in training for skill, for confidence at work, or for willingness to
persist in social activities, that the child is guaranteed some measure of success and approval,
regularly at first, [...]nally, [...]

The sentence “Outside the laboratory....is strictly partial reinforcement” should be preceded by Keller
and Schoenfeld (1950, p. 98) caution that

The sentence “It is hardly to be expected though....by our complex social environment” should be
preceded by Keller and Schoenfeld (1950, p. 98) point out further that
Add to the references:

Keller, F. S., and W. N. Schoenfeld, Principles of Psychology (New York: Appleton-Century-Crofts
Co., 1950).


CORRECTIONS FOR

“The Statistical Interpretation of Degrees of Freedom”

by
William J. Moonan

The Editor and author are sorry for the statistical and typographical errors that
appeared in the article as published in the March 1953 issue of the Journal of
Experimental Education.


                Printed line
                or equation
Page   Column   number         Correction

259    1        30             applications
259    2        19             [equation illegible in scan]
260    1        (1)            [equation illegible in scan]
260    1        13             [equation illegible in scan]
260    2        (5)            [matrix equation illegible in scan]
260    2        39             [equation illegible in scan]
261    1        (7)            [equation illegible in scan]
261    2        8              [equation illegible in scan]
261    2        10             [equation illegible in scan]
261    2        19             [equation illegible in scan]
261    2        (11)           Change: Yik to Yij
261    2        24             [equation illegible in scan]
261    2        27             [equation illegible in scan]
261    2        33             [equation illegible in scan]
262    1        (12)           [equation illegible in scan]
262    1        23             delete: [expression illegible in scan]
262    1        31             [equation illegible in scan]
262    1        34             Change: andy to and
262    2        17             [equation illegible in scan]
263    1        16             [equation illegible in scan]
263    1        22             [equation illegible in scan]
263    1        27             [equation illegible in scan]


Journal of Experimental Education

Volume XXII          March, 1954          Number 3


MEASUREMENT OF WRITING ABILITY AT
THE COLLEGE-ENTRANCE LEVEL:
OBJECTIVE VS. SUBJECTIVE
TESTING TECHNIQUES*


EDITH M. HUDDLESTON**
Educational Testing Service
New York City

* A dissertation in the Department of Psychology submitted to the faculty of
the Graduate School of Arts and Science in partial fulfillment of the
requirements for the degree of Doctor of Philosophy, 19[...].

** The author is deeply indebted to Dr. Scott B. Elledge, Associate Professor
of English at Carleton College, who worked closely with the writer in preparing
and criticising test materials and in developing the criterion data. For
technical consultation [...] indebted to the following: Dr. Ralph W. [...]bert
and Dr. Naomi Stewart of the Graduate School [...]; Dr. Ledyard R. Tucker and
Dr. William W. Turnbull of [...]; [...] and instructors who participated in
furnishing criterion data.


A. Background of the Problem

ONE OF THE most fundamental controversies in the history of psychometrics has
been [...] on the relative merits of objective vs. subjective measuring
techniques. The subjective or free-answer tests have always had the advantage
of apparent or “face” validity: the traditional essay examinations are of the
work-sample type, and require the examinee independently to summon and organize
his relevant knowledge. Thus the essay test has been thought of as a “natural”
task, allowing a direct approach to important goals. But as far back as the
1880s it was realized that this otherwise ideal testing [...] was beset with
the pitfall of unreliability ([...]). Early studies (14, 54) showed that there
was considerable discrepancy among teachers’ marks, and a vast array of later
studies confirm this fact. In 1929 Ruch provided a table summarizing 285
coefficients of reliability for essay examinations, with a median reliability
of .59 (4[...], p. 107). This corroborates the experience of later
investigators. In 1947 Adkins (1, p. 6) wrote: “Essay tests, no matter what
their merits may be, are commonly considered impractical [when] the number of
subjects is at all sizable, [because] of the great difficulty in scoring them
reliably and because of the time required to [...] them.” A review of the
literature indicates that the unreliability of essay examinations is most
pronounced in the area of English composition, while efforts to improve
reliability for achievement examinations covering specific content (e.g.,
history or physics) have been somewhat more successful. Stalnaker (34), who
proposed an “analytical” method of scoring essay tests as a means of raising
reliability, reported that for English composition the method is
unsatisfactory.

The objective test has its origins in this demonstrated need for reliability.
With the objective test the problem of “reader reliability” (agreement of
readers with one another) disappears, and the scoring of the test is simply a
matter of clerical accuracy. There still remains an error of measurement due to
imperfect test reliability but it has been demonstrated that the objective test
is generally much more reliable. A good objective test has a reliability of at
least .85, and frequently of .90 or more.

Historically, however, the objective test has been severely criticized on the
grounds that it presents the examinee with a task which is artificially
oversimplified. It has been charged that the examinee is inadequately measured
when he is required merely to choose his answer from among a number of
answer-choices which are set down for him. The proponents of objective [...]
that such measuring [...] valid in [...]

To a large extent this challenge has been sat[...]. Objective-type questions
have [...] which do require thought and organization on the part of the
examinee; statistical analysis has indicated that abilities tested by objective
tests are frequently similar to those which essay examinations aim to measure;
in [...] been found su[...]

[...]ation are, if not [...] solved at the present time.” For a long paper on a
single topic, the Board readers achieved a reader reliability of .55 which is,
as Stalnaker [...] relationship between height and weight.” (52) Six other
themes (40) were read with reliabilities, respectively, of .67, .66, .83, .69,
.58, and .59. Further unpublished data in the Board’s files shows that for nine
one-hour examinations consisting of three or four short essay questions each,
the reader reliabilities for total score ranged from .68 to .89. Note that the
figures below are “reader” reliabilities only; when the [...] test
unreliability is also taken into account, the total unreliability is even
greater. It is interesting to observe that the correlations with the verbal
score of the Scholastic Aptitude Test were almost as high as the reader
reliabilities.


                      Reader         Correl’n with
Test                  Reliability    S.A.T. Verbal

September 1944        .89
December 1944         .85            [illegible]
April 1945            .69            .55
June 1945             .81
September 1945        .88
December 1945         .82            .61
April 1946            .68            .58
June 1946             .69            .55
December 1946         .68            .57
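
A reader reliability of the kind tabulated here expresses the agreement of readers with one another. As a minimal sketch, assuming the coefficient is a Pearson product-moment correlation between the scores two readers assign to the same papers (the text does not say which coefficient the Board computed), it could be calculated as follows. The function name and the six pairs of scores are hypothetical and are not data from the study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two score lists of equal length (at least 2 papers)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squares of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return sxy / sqrt(sxx * syy)


# Hypothetical scores given by two readers to the same six essays
# (illustrative only; not taken from the Board's files).
reader_a = [55, 62, 70, 48, 81, 66]
reader_b = [60, 58, 75, 50, 78, 61]

if __name__ == "__main__":
    print(f"reader reliability (Pearson r) = {pearson_r(reader_a, reader_b):.2f}")
```

The same function applied to essay scores and verbal test scores would give a correlation of the kind shown in the second column, which is why the two columns can be compared directly.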
Intercorrelations among the Board’s essay questions tended to be low. In view
of the fact [...] English Composition twic[...] 1, 25, 28, 49), while some
[...] have held out hope for [...] reader reliability (29, 45, 58, 58, 60, 61).
However, there is no convincing evidence in the literature that these hopes
have come to fruition.

In contrast to the discouraging status of the [...] in English composition,
there have been [...] indications of the effectiveness of the objective [...]
test. Tests of English grammar [...] and tests of verbal ability have shown
high reliabilities and high validities, and there is [...] to indicate their
superiority. [...] reliabilities for objective English tests [...]
representative examples (see page 168). The effectiveness of the objective
English tests in [...] appropriate criteria is summarized in the [...] studies
(page 169).
[...] and McGann (37) have reported their success in using objective English
tests for guidance purposes. Studies by Willing (62, 63) led him to conclude
that tests in the recognition of errors are reasonably good instruments for
predicting the average number of formal errors that pupils will make in free
composition, but not for predicting the specific kinds of errors.

Pressey (43) conducted a study in which he compared the validity of essay tests
with a test of his own construction. His test included objective items and
items in which the student was required to revise given material. He secured
ratings in written English from three experienced teachers, the average of
whose ratings was used to represent the general ability of the children in
written English. With these ratings as criteria, he determined the following
validity coefficients: scores on narrative composition, .29; scores on
descriptive composition, .33; scores on his own test, .72. A study by Hartson
(27) using the same types of tests corroborated Pressey’s findings. McKee (38)
and Stalnaker (46) offer additional evidence that objective English tests [...]
valid than are scores on themes, while a study by Flemming (18) indicated that
a revision [...] showed a higher relationship [...] than did a composition
rated by the Hudelson Scale. Willing’s work (62) showed that [...] to free
composition [...] student weaknesses.

With regard to the charge that an objective test gives a distorted picture in
that it tests recognition rather than recall, a study by McCullough and
Flanagan (36) gives contradictory evidence. The investigators found a
correlation [...] between the usage section of the all-objective Cooperative
English Test, Form OM, and the usage section of the Cooperative English Test
[...], [...] was largely a correction-of-error [...] permitting free response
written in the test [...]. Using the Wisconsin tests of grammatical
correctness, Leonard (33) reported a correlation of .68 between the objective
and the free-response form. And Stalnaker (47) concluded [...] for at least one
group of students a test of ability to classify sentence faults gave virtually
[...] same score as did a test of ability to correct [...]


The evidence with respect to the characteristics of verbal-factor tests leads
to the hypothesis that the verbal factor may be [...] value in predicting
success in English composition. Verbal tests are typically highly reliable;
[for exam]ple, the verbal section of the C.E.E.B. Scholastic Aptitude Test was
reported to have a corrected odd-even reliability coefficient of .96 (10, pp.
30-33). A factorial study by Carroll (9) as well as the Board’s own
developmental work indicates that the verbal score of the Scholastic Aptitude
Test measures primarily verbal ability. Correlations between verbal tests and
English grades have been reported by several investigators (see page 170).
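
A “corrected odd-even” coefficient of the kind cited above is conventionally obtained by correlating scores on the odd-numbered items with scores on the even-numbered items and then stepping that half-test correlation up to full test length with the Spearman-Brown formula, r_full = 2r / (1 + r). The sketch below assumes that convention; the source reports only the corrected figure, so the half-test correlation of 0.92 used here is a hypothetical input chosen for illustration, and the function name is likewise an assumption.

```python
def spearman_brown(r_part, k=2.0):
    """Estimate the reliability of a test k times as long as the part
    whose reliability (or half-test correlation) is r_part."""
    denom = 1.0 + (k - 1.0) * r_part
    if denom <= 0:
        raise ValueError("formula is undefined for this combination of r_part and k")
    return k * r_part / denom


# Hypothetical correlation between odd-item and even-item half scores.
r_halves = 0.92

if __name__ == "__main__":
    # k = 2 because the full test is twice as long as either half.
    print(f"corrected odd-even reliability = {spearman_brown(r_halves):.3f}")
```

With k left at 2 the function gives the usual split-half correction; other values of k give the projected reliability of a lengthened or shortened test, which is the sense in which several of the reliabilities reported for objective English tests are labeled Spearman-Brown.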

There is some evidence regarding the interrelationships among objective English
tests and verbal tests. Crawford and Burnham (11) reported correlations between
verbal and objective English tests as ranging between .65 and .83. Doppelt (13)
obtained a correlation of .72 between a verbal reasoning test and a grammar
test. Krathwohl (31) found that a vocabulary test correlated .58 with each of
two objective tests in English expression. McCullough and Flanagan (36)
reported a correlation between vocabulary and usage of .69. These correlations
are high enough to suggest the hypothesis that objective English tests are
heavily loaded with the verbal factor.

While the present review of the literature gives some important insights into
effective measurement of writing ability, it points to the necessity for a
large-scale investigation in which all variables are pooled to determine their
interrelationships with one another. Particularly indeterminate is the question
of whether objective English tests or verbal-factor tests are more closely
related to ability in English composition.


B. Purpose of the Investigation 


As the preceding review of literature indicates, the problem of measuring
ability in English composition has been attacked piecemeal with the consequence
that little light has been thrown on its over-all aspects. Tests with varying
degrees of promise have been evaluated more or less effectively, but on
separate populations. This situation is understandable in view of the great
expense which would ordinarily be involved in a large study combining a number
of variables and employing a sizeable test population. The present writer is
fortunate in having had the opportunity and resources necessary to carry out
such an investigation.

The studies reported in the present investigation were designed to meet the
needs indicated by the gaps in present knowledge, and were conducted at the
request of the College Entrance Examination Board. The initial problem was to
work with teachers in developing an acceptable

Intercorrelations Among Questions

                                  1        2        3        4

September 1944    Question 1    (.81)     .94      .49      .52
                  Question 2             (.82)     .50      .56
                  Question 3                      (.75)     .58
                  Question 4                               (.75)

December 1944     Question 1    (.80)     .54      .48      .45
                  Question 2             (.66)     .53      .41
                  Question 3                      (.66)     .39
                  Question 4                               (.79)


Reported Reliabilities for Objective English Tests

Source of Information         Test                                     Reliability

Asher (3)                     Kentucky English Test                    .93 (retest)

Buros (6, No. 1269, 1)        College English Test: National           .88 (Spearman-Brown)
                              Achievement Test

California Test Bureau (8)    Test of English Usage                    .94, .95 (Kuder-Richardson)

Lindquist (35)                Test of Correctness and                  .92, .94 (Spearman-Brown)
                              Appropriateness of Expression

Stalnaker (49)                University of Chicago English            .88 (Spearman-Brown)
                              Scholarship examination of
                              May 1934

Traxler (56)                  Cooperative English Test A,              .98 (Spearman-Brown)
                              Mechanics of Expression
                              (Form R)

World Book Company (64)       Barrett-Ryan-Schrammel                   .88, .89 (alternate form)
                              English Test                             .94, .91, .91 (Spearman-Brown)


List of Validity Studies

                                                                       Validity
Investigator                  Test                                     Coefficient    Criterion

Asher (3)                     Kentucky English Test                    .13, .62       Freshman English grades

Berg, Johnson and             Cooperative English Test A,              .69            Freshman English grades
Larsen (5)                    Mechanics of Expression
                              (Form Q)

Cade (7)                      Conkling and Pressey Diagnostic
                              Tests in English Composition
                                (Form C)                               .51            Freshman English grades
                                (Form D)                               .47            Freshman English grades

Edmiston and                  English Usage Test of the Ohio           [illegible]    Scores on a composition test
Gingerich (15)                State Every Pupil Tests

Fletcher and                  Ohio State University English
Hildreth (19)                 Placement Test:
                                Usage Section                          [illegible]    Freshman English grades
                                                                       .20 to .29     Instructors’ ratings
                                Grammar Section                        .23            Freshman English grades
                                                                       .11 to .22     Instructors’ ratings

Gladfelter (22)               Cooperative English Test:                .66            Freshman English grades
                              Usage score

Hartson (27)                  Pressey Diagnostic Tests in              .63, .43       Freshman English grades
                              English Composition

McCullough and                Cooperative English Test,                .54, .62       Teachers’ ratings
Flanagan (36)                 Form OM: Usage score                                    (12th grade)

Wagner and Strabel (59)       Cooperative English Test,                [illegible]    Freshman English grades
                              Subtest Mechanics


Correlations Between Verbal Tests and English Grades

Investigator                  Test

Carroll (9)                   C.E.E.B. S.A.T. Verbal

Crawford and Burnham (11)     C.E.E.B. S.A.T. Verbal
                              Yale Verbal Reasoning Test

Ellison and Edgerton (16)     Thurstone’s Verbal Factor

Garrett (20)                  Thorndike’s CAVD Test:
                                Completion score
                                Vocabulary score
                                Directions score

Goodman (23)                  Thurstone’s Verbal Factor

Grinnell (26)                 Inglis Tests of English Vocabulary

Hartson (27)                  Inglis Tests of English Vocabulary

Landry (32)                   C.E.E.B. S.A.T. Verbal

Thompson and Haines (55)      A.C.E. Psychological Examination:
                              Linguistic score

Wagner and Strabel (59)       A.C.E. Psychological Examination:
                                Opposites
                                Completion

Correlation with English Grades (column detached from its rows in the scan;
surviving entries in scan order): .39; .3[...] to .64; .33 to .54; [illegible];
.93; .42; .44; .40 to .55; .49; .41


[...] “ability to write.” It must be recognized that in the present state of
our knowledge, and for a long time to come, this ability cannot be defined in
great detail; it is a comprehensive ability, and one which can be expected to
vary in the same person from time to time and from task to task (34, p. 504).
Hence, the writer decided that ratings of writing ability would be the best
type of criterion which could be developed, particularly since the corps of
College Entrance Examination Board readers [...] an excellent source of raters
who had long experience in the evaluation of writing and [...] also had classes
of students with whose abilities they were well acquainted. By working with
these raters individually and intensively the writer expected to secure a
conscientious and well-considered rating in each case.

The construction of the test material to be evaluated and analyzed was planned
as follows:

(1) The writer felt that much of the criticism of available objective tests in
English was valid as the questions on such tests are frequently picayune. The
writer proposed, therefore, to develop some objective questions to test the
abilities normally covered by such tests but to emphasize the type of question
which would give the greatest possible opportunity for the student to
restructure the written material presented and which would come as close as
possible to measuring those attributes which essay tests aim to measure. An aim
of the study was to test the hypothesis that such a test would be a better
measure of writing ability than an essay test.

(2) Since the highly structured revision-type material had shown promise, it
was decided to include it here. Such test material presents the student with a
definite and specific task, allowing the readers very little latitude for
difference of opinion. It was desired to determine if this semi-objective
essay-type question, which would probably be highly reliable, would also be
valid.

(3) The writer wished to discover the extent to which a purely “verbal” test
would measure ability to write. The hypothesis was made that [...] tests of
writing ability, both objective and subjective, are measuring verbal ability
and are measuring it less well than the traditional verbal test measures it.
The hypothesis to be tested was that writing ability, insofar as it could be
measured at all, is verbal ability.

(4) Relationships among all variables were to be studied.


C. The General Study Plan


The basic problem for investigation grew out of a concrete situation in which
the abilities of students were being evaluated for the guidance of college
admissions and placement officers. Fortunately, this concrete situation was one
which lent itself readily to providing large experimental populations and
generous cooperation from professional people in the schools and colleges and
on the College Entrance Examination Board staff.1 The study, therefore, was
designed in such a way as to take full advantage of the Board’s facilities and
of its established testing programs and procedures.

1. The former laboratory staff of the College Entrance Examination Board is now
a part of the Educational Testing Service.

The Board’s largest testing program takes place in the spring of the year when
secondary school seniors are examined for admission to college in the following
September. The English essay examinations are read and scored by a group of
secondary school teachers and college instructors who come to the Princeton
laboratory for this purpose. The readers work under the general direction of
the Committee of Examiners in English Composition. At any one session most of
the readers are people with years of previous experience, although new readers
join the group from time to time. The work is highly organized. The readers
work in groups of about half a dozen, each group having a table leader whose
responsibility it is to spot-check the work of the others and to confer with
other table leaders and with the Chief Reader to insure the maximum amount of
consistency in evaluation. Ordinarily the first day is spent in reading and
discussing “sample” papers in order to arrive at mutual agreement on the
application of standards. Prior to this time the table leaders have already
spent considerable time in similar practice.

The study plan aimed to utilize the April program to provide an opportunity for
comparison of different types of test material on a large population. Before
this could be done, however, some preliminary evidence was needed to justify
the presentation for the first time of objective test items in a Board
examination in English Composition. Hence, a preliminary study was planned to
obtain some validity data on the proposed item types and to test the
effectiveness of the procedures proposed for use in April.

As a method of attack on the problem of measuring writing ability, the two
parts of the investigation are to be regarded as mutually complementary. Not
every aspect of the problem is dealt with in each of the studies, but the
general conclusions are derived from a synthesis of the findings of both.

The function of Study I was to construct the experimental test material, to
develop a criterion measure of ability to write, to study the characteristics
and interrelationships of the


[var]ious types of material, and to judge whether the evidence justified the
continuation [of] the investigation on a larger scale. While there were fewer
subjects available for Study I, which utilized college freshmen, the possible
testing time was longer: 150 minutes as compared with 60 minutes for the
English test in Study II. Thus, in Study I it was possible to use enough
material to gain information on several breakdowns of the objective items.
Study II employed the methods [...] problems under investigation.


D. Study I

(a) The Test Population

Sixteen college-freshman English classes participated in the study, ranging in
size from 22 to 33 students per class. The students in these classes were
tested during October and November, shortly after the beginning of their first
semester of college study in English. Five separate educational institutions
were represented. The classes may be described as shown below.

Institution A. Large middle-western college, coeducational:

    Class 1     Average students          N = [illegible]
    Class 2     Above-average students    N = 27
    Class 3     Below-average students    N = 33
    Class 4     Average students          N = 24

Institution B. Large middle-western university, coeducational:

    Class 5     Above-average students    N = 25
    Class 6     Average students          N = 22
    Class 7     Average students          N = 24
    Class 8     Below-average students    N = 26

Institution C. Large eastern university, coeducational:

    Class 9     Above-average students    N = 24
    Class 10    Above-average students    N = [illegible]

Institution D. Large eastern university, male undergraduate college:

    Class 11    Average students          N = 29
    Class 12    Average students          N = 27 (?)
    Class 13    Average students          N = [illegible]

Institution E. Small eastern college for women:

    Class 14    Average students          N = 23 (?)
    Class 15    Average students          N = 24 (?)
    Class 16    Average students          N = 26

The indicated classifications of students as to ability were made by the
institutions themselves and thus can be expected to have a different meaning in
each institution. In the sixteen classes nine instructors were [...]. Groups of
classes taught by [the] same [instructor] were: classes 1 and 4; classes 5 and
6; classes 7 and 8; classes 11, 12, and 13; and classes 14, 15, and 16.

(b) Description of Variables

1. Objective English Test: This test was administered in two sections: Section
I, 50 minutes, 149 scorable units; Section II, 10 minutes, 32 scorable units.
Since several scorable units may often be found in a single sentence, this
material moves quite rapidly in the testing situation.

The leading concept in the construction of this material was that the student
should be gi[...] presented to hi[...] experience in working with other tests
that objective items can be highly rigid or highly flexible. The emphasis on
flexibility in the present [...] was aimed at adopting as many of the desirable
characteristics of the free essay [...] as was possible. Approximately forty
percent of the questions were culled from previous tests constructed for
special programs under the Board’s jurisdiction; the remaining sixty percent
were tailor-made for the present investigation.2

The items were designed to fall in four general categories: punctuation, 27
items; [...]ic expression, 33 items; [grammar], 47 items;3 and sentence
structure, [...] items. The [great]est flexibility was attained in sentence
structure, which involved such points [as] parallelism of sentence elements,
[...] misplaced and dangling modifiers, subordination of [...], wordin[...],
[...] stringy [...]

2. Of the new items, approximately two-thirds were constructed by the writer
and approximately one-third by Professor Scott B. [Elledge].

3. There were actually 51 grammar items in the test, but 4 were omitted from
scoring since it was found that this omission would greatly reduce the clerical
labor involved in combining [...]


March, 1954) 


In 
(Σοκ the test consists of sentences and 
lined and nu τ. certain portions are under- 
ls accompa mbered. Each underlined portion 
Sible ways nied by suggestions of several pos- 
len. m ó in which that portion might be writ- 
Which of ig case the student is asked to decide 
Which sou e suggested answers is correct, or 
nds best in the sentence. The follow- 


NE exa. 
is ος will illustrate the potentialities of 


The 

su " ο. 

arded oe of any experiment is jeop^ 
y inaccuracy one error may in- 


vali : 
idate all the conclusions found. 
2 


l. (1) inaccuracy one 
(2) inaccuracy, one 
(3) inaccuracy; one 


2. (1) found 
(2) which are found 
(3) OMIT 


hed to go and return on the same 


yf 
or several reasons. 


3 


3. (1) (Leave where it is now.) 
(2) (Place at beginning of 
sentence. ) 
(3) (Place after “return, ”) 


2. 
o queda Questions: The three 20-minute 65” 
b Maia were proposed by the Committee 
ta Vegard ers in English Composition, and may 
ton Pe as typical of essay questions prev- 

b Suffi y them. The topics were intende 
T Sen y universal that all students would 
Ww Com ο. experiences upon which to draw. 
rare t = eted papers were sent to princeton 
spelar xd were scored by a group of the Board’s 
an, ndaras dete according to typical Board 
γος. Bani Each essay was rated for: material 

Pabula zation; spelling; punctuation; syntax; 

Y; and sentence structure. 
rded as 


Ws: ee essay questions were WO 


T 

In ab 

Paragr, t 150 words write a well-organized 

eq, > 4ΡΗ on the subject, «A Change i$ Need- 

Serious}, u may advocate any change which you 

Argum y think should be made. Support your 
ent with as much detail as possible. 


n 


Tor 
a 
R) ches tonal illustrations 898 appendix 1. 
ce 3; (2) choice 8; (8) choice 2. 


HUDDLESTON 


173 


IL Tne statement ‘‘All men are created equal" 
may be interpreted in several different ways 
because in the statement the word equal may j 
have several meanings. Explain with the 
help of examples several of the meanings of 
equal and indicate which of these meanings 
you think makes the statement most nearly 
irue. Your answer should be in the form of 
a well-unified paragraph of about 150 words 
that has for its subject what you think this 


statement means. 


III. In a paragraph of about 150 words discuss
a serious error which you think parents may
make in the rearing of a child, and indicate
the possible consequences of this error. Use
your own experience or your observation of
friends and acquaintances to support your
opinion.


3. Paragraph-Revision Test: This test was
devised in an attempt to bridge the gap between
the objective and free essay forms. Even the
most intensive self-discipline by the readers
leaves great latitude to the individual reader in
topics such as those listed
above. An important criticism of the objective
test, on the other hand, is that it gives the student
several alternatives from which to choose
instead of allowing him to write naturally. In
the paragraph- i 
complete paragraph to 
to rewrite as à 
are planted in the paragraphs, whic 
may or may not recognize. 
ent to judge the correctnes 
demonstrate 
the reader the task is ease 
has a series of specific poi 


rate the student. There is a ...
in the variety of responses which are possible,
and the readers may agree in advance on the
score value of each type of response.

The two paragraphs used, under a total time-limit
of 20 minutes, are as follows:
Directions: In each of the paragraphs below
you are to assume that the first and last sentences
are satisfactory as they stand, but that
the material between the first and last sentences
needs to be rewritten in the interest of correctness
and good style. In your answer booklet,
rewrite each paragraph, making whatever changes
you think desirable in order to produce a smooth,
well-written, and well-organized piece of work.


The answers to the above questions are: 


174 JOURNAL OF EXPERIMENTAL EDUCATION 


You need not keep the same sentence divisions,
or the same order of presentation, but it is important
to preserve completely the original
writer’s meaning. Make no changes in the first
and last sentences.


A 

One of the most remarkable things
about the Chinese is their power to secure
the affection of foreigners. Nearly
every single person who is a native of
Europe likes China, those who both
come only as tourists and those who
live there for many years. In spite of
the Anglo-Japanese alliance, I can’t
hardly recall one single Englishman in
the Far East who liked the Japanese as
well. The obvious evils are strikingly
obvious to whomever has just recently


very prevalent, and the anarchy in the 
politics, which are also corrupt. The 
Tong desire to reform 


5 not, however, affect 
his love for the People, 


B 


For two days and nights the havoc 
raged unchecked through all the church- 


lages. There weren 


ever, the destruction was complete, 


The verbal items were anto- 
nyms, chosen from the Board’s Scholasti 


(Vol. XXII 


ficient to give a rough indication of what might 
be expected in Study II. 

The following items are similar to the ones 
used in the study:5 


Directions: Each question in this part consists
of a group of four words, two of which are approximately
opposite to each other in meaning.
Decide which two words in each group are most
nearly opposite, and blacken the space beneath
the corresponding pair of numbers on the answer
sheet; i.e., mark the space between the dotted
lines beneath “1-2” if words numbered 1 and 2
are opposite, beneath “3-4” if words 3 and 4
are opposite, etc. Mark only ONE set of dotted
lines for each question.


1. 1-qualified 2-unfit 3-healthful 4-prima...
2. 1-circumscribed 2-tedious 3-senile 4-interesting
3. 1-authentic 2-mechanical 3-spurious 4-productive


5. Instructor’s Ratings: The major criterion 
consisted of rating 


their student, 


Sider only the students’ ability to write the kind 


volunteered by thi 
told to him. The following points were emphasized
in each interview:

1. Do not judge the students’ “imaginativeness”
or “creative ability.”
5. Since the actual items used are the confidential property
of the Board, the samples offered ... Bulletin of
Information. The answers to the above questions are: (1)
1-2; (2) 2-4; (3) 1-3.


March, 1954) 


2. Insofar as possible, eliminate the intelligence
factor from the rating.

3. Do not consider factors other than writing
ability which ordinarily go into English course
grades. Try to rate as if you were not a
teacher. For example, if you are making an
effort to teach punctuation to a
class, you will give a low course grade to
the student who does not learn what you are
teaching; however, in making these ratings,
try not to let any single factor carry
too much weight. Eliminate from consideration
such things as behavior in class, effort,
promptness in handing in assignments, proficiency
in literature, etc.


4. Do not attempt to predict the results of the
C. E. E. B. English Composition Test or of
these experimental tests. Do not give a student
a low rating because you know he will not
do his best work in an examination situation,
but base your rating on the assumption that
the student is handling a normal-length assignment
under ideal conditions. Base your rating
on the student’s ability to do sustained
writing, for an hour or two, and not upon his
ability to dash off a twenty-minute theme of
the C. E. E. B. type; the reason for this is that
the purpose of the study is to measure the extent
to which the C. E. E. B. test scores indicate
the students’ ability in normal writing
situations.


These ratings were obtained after the instructors
had had about three months of class work
with the students concerned. The instructors had
several samples of each student’s work on
previous writing assignments (not the experimental
tests) and they were encouraged to refer
to these liberally whenever they wished to refresh
their memory regarding a particular student’s
performance.
Two different sets of ratings were obtained.
First, the instructor simply placed the students
in rank order relative to one another. After he
had completed this stage he was then asked to compare
every student in a class with every other student
in that class, stating each time whether the first-named
student was better or poorer than the
second-named student; every pair was later presented
with the two students being named
in reverse order. Each student’s paired-comparison
score was taken as the total number of
times he had been rated as better than another
student. The two types of ratings were desired
in order to discover the extent to which the
paired-comparison technique would yield the
same results as the simpler rank-order method.
If the more elaborate method were found to introduce
no significant changes in the ratings,
then the tedium of that painstaking technique could
be dispensed with in Study II.
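
The paired-comparison score described above can be sketched in a few lines. This is only an illustration under stated assumptions: the student names, the function names, and the perfectly consistent rater are hypothetical and are not taken from the study.

```python
from itertools import permutations


def paired_comparison_scores(students, is_better):
    """Return each student's paired-comparison score.

    `is_better(first, second)` stands in for the instructor's judgment
    and returns True when the first-named student is rated better.
    Every pair is presented twice, once in each order of naming.
    """
    scores = {name: 0 for name in students}
    for first, second in permutations(students, 2):
        if is_better(first, second):
            scores[first] += 1
        else:
            scores[second] += 1
    return scores


# Hypothetical class of four, already placed in rank order (best first).
rank_order = ["Adams", "Baker", "Clark", "Davis"]
position = {name: i for i, name in enumerate(rank_order)}


def consistent_judge(first, second):
    # Assumed rater who never contradicts his own rank order.
    return position[first] < position[second]


scores = paired_comparison_scores(rank_order, consistent_judge)
order_from_pairs = sorted(scores, key=scores.get, reverse=True)
print(scores)
print(order_from_pairs == rank_order)
```

With a fully consistent rater the two methods give the same ordering; any disagreement between them in practice reflects inconsistency in the individual judgments, which is what the comparison of the two sets of ratings was meant to reveal.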
6. English Course Grades: At the end of the
semester, each of the instructors supplied a
copy of the final course grades for each class
used in the study. This secondary criterion was
chosen because the English Composition Test is
intended to function as a measure of past achievement
in English courses and as a predictor of
probable success in future English courses.
Thus the correlations of the predictor variables
with course grades were considered to be of interest
despite the extraneous factors which enter
into classroom grading.


(c) Results 


In order to reduce clerical expenses it was
decided to drop Section II of the Objective English
Test from the analyses which follow. The
total Objective English Test score will, therefore,
be represented by the Section I score. The
149 items in Section I included 27 punctuation
items, 20 idiomatic-expression items, 47 grammar
items, and 55 sentence-structure items.

For analyzing the different types of objective
items, however, the subscores incorporated 13
idiomatic-expression items and 15 sentence-structure
items from Section II. Accordingly,
the English subscores represent:

27 punctuation items (“P”)
33 idiomatic-expression items (“I”)
47 grammar items (“G”)
70 sentence-structure items (“S”)


An explanatory note is also in order regarding
the reporting of the class N’s in the tabular
data which follow. Not all students in each class
took all the tests, which were presented on different
days; additional drop-outs occurred because
some students did not receive end-of-semester
course grades. In the Paragraph-Revision
Test, some students made no response to Paragraph
B. Furthermore, the instructor in Class
09 was unable to allow time for Section II of the
Objective English Test, and the instructor in
Class 10 omitted all of the objective English
and verbal material. The writer decided that
it would be desirable to preserve the maximum
number of cases for each intercorrelation, even
though all intercorrelations for a class would not
be based on the same N. For each intercorrelation,
therefore, the corresponding N is reported.
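
The practice of keeping every available case for each intercorrelation, and reporting its own N, corresponds to what is now usually called pairwise deletion. A minimal sketch follows; the scores and the function name are hypothetical, and the Pearson product-moment coefficient is assumed as the correlation in question.

```python
import math


def pairwise_r(x, y):
    """Pearson r over the cases present on both variables, with its N."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    if n < 2:
        return None, n
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    sxx = sum((a - mean_x) ** 2 for a, _ in pairs)
    syy = sum((b - mean_y) ** 2 for _, b in pairs)
    if sxx == 0 or syy == 0:
        return None, n  # r is undefined when either variable is constant
    return sxy / math.sqrt(sxx * syy), n


# Hypothetical scores for six students; None marks a test not taken.
essay = [30, 42, None, 25, 38, 45]
verbal = [12, 17, 14, None, 15, 18]
grade = [70, 85, 78, 65, 80, 90]

r_ev, n_ev = pairwise_r(essay, verbal)
r_eg, n_eg = pairwise_r(essay, grade)
print(n_ev, n_eg)  # the two correlations rest on different N's
```

The cost of this choice, as the text notes, is that correlations within one class are not all based on the same students, so each coefficient has to be read alongside its own N.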

1. Levels of Performance: Table I gives the
mean and standard deviation on each variable
for each class. Inspection of these data indicates
that the difficulty ranges of the various
measures were sufficiently great to allow for
discrimination within the classes as well as to




TABLE I


MEANS AND STANDARD DEVIATIONS OF ALL VARIABLES (BY CLASS)
(N) = Number of Cases


Objective English Test 


Secti “P” Score * T" Score 
Class ection I 


Hower IO IO [ο 


11 103.17 20,56 29 19.24 Ser 5 | fassus 


N 

o 

co 

-1 
Prep. Mig ΟΣΑ. 


3.31 29 19.31 62 29 
12 107.85 16.23 26 19.23 2.97 26 20.12 .02 26 
13 107.30 16.68 30 19.57 3. 45 30 j 87 30 
14 121.23 11.42 22 21.95 2. 60 22 22.18 23 22 
15 123. 04 8.19 23 21.48 2.90 23 22.13 94 23 
16 118.29 10. 78 24 21.38 2.32 24 21.71 68 24 


“Ὁ” Score 


Essay Total 


3 i 7.53 i à 

03 29. 85 6.70 33 Shae ne : 
: .48 8.21 32 
04 33. 13 5.79 — 24 | 44/33 6.91 24 35:57 Ei 2 23 
05 33. 38 5.31 — 24 | 42.95 8490 24. | 4315 7.05 35 
06 23, 41 5.98 — 22 | 31.27 5.95 — 22 | 81 68 8.70 19 
07 27. 00 5.86 23 | 37.55 ToL 28 | 3506 9.388 21 
08 19.13 3.85 24 26.58 4.22 24 22. 33 6.39 24 
09 39. 33 8ή — 84 |“... bis P 51.33 9.45 24 
LL Lan étes 4 
11 32.28 7.68 29 | 4442 8.42 39 dd Ἢ Ea 25 
12 34. 73 5.51 26 | 46 3] GO ΒΒ NEG 9.96 26 
13 34, 20 5.96 430 | 45/19 125 830 | at πα 10.59 239 
14 38. 86 3.33 2» | 5141 $34 — 29 | Aj 2s 8.13 22 
15 39. 70 "AME BE EE 23 | 48.57 8.79 38 
16 ο. M ME. E | μεν, 10.68 24 




TABLE I (Continued) 


Class μα” μαμα ο Lim NO E 
|J uo SE n. 
0 12. 96 3.83. 25 | 14.84 46: 25 | 13.24 5.21 25 
S 14. 22 424 2T | 15.88 5,37 26 | 14.33 4.29 27 
04 11. 85 4.31 33 12.88 6. 06 32 11.15 4.95 33 

11.83 421 23 | 12.83 474 23 | 10.91 4.19 23 
05 14. 08 4.08 35 | 14.48 516 25 | 14.52 4.66 35 
06 10. 63 4.32 19 | 11.80 4.00 20 8.90 4.06 20 
ΟἹ 11.36 3,72. 22 | 12.50 5.02 24 | 11.32 431 23 
08 1.84 3.15 35 8. 04 3.82 25 7.04 3.05 36 
09 16. 92 446 24 | 17.17 449 24 | 17.67 4.91 34 
10 14. 88 495 24 | 18.00 sos Δὲ |1604 B2 24 
11 11.68 418 25 | 15.52 $a ss |i ΒΩ τ. 
12 15.38 5.23 26 | 19.27 τω 3 um ΕΠ 26 
19 14, 17 4.80 59 | 15.14 473 29 | 12.28 3.75 329 
14 16.18 417 33 | 16.95 Kus 3 155 5Β 22 
15 16. 87 411 23 | 16.52 3.55 23 | 15.35 4.05 33 
16 16.17 4" 24 | 11.11 aog » ΙΒ 24 


Paragraph A Paragraph B 

n 6,58 25 | 31 4. 18 

02 oo a 164 26 | 32-22 4 2E | δ 2 

E 35. 73 aol 33. | 27.98 7.58 33 &35 83.365 81 
04 40. 13 $23 33 |9117 5.45 33 296 3.565 23 
ΒΒ 43. 08 Τ.Σ 24 |328 9 $ asus bb Β 
pe 28. 30 Tao 20 | 25.95 SA | 5 2t 1 
Fu 41. 04 5.36 23 31. 88 4.78 24 8. 88 2.55 24 
ο gee oar A |. aTe 38 | 658 3.4Ἱ 18 
o Qi υπ BILE 5.57 24 120 3.19 34 
ο ger am 26 | Be, 5.27 24 |Ο 3.59 34 
NH um απ 8€ [uo $90 25 | 848 3.28. 21 
12 42.58 $68 36 | 34.00 476 36 9.29 — 2.19 24 
13 40. 41 376 29 | 31.12 7.32 29 9. 00 4 28 
hoo woa — P on|u4 . 2^ 22 |1086 317 22 
bo Ἢ επ E NU δ 23 | 1068 2.55 22 
ie 39.70 11.62 23 31.29 8.22 24 10. 00 3.19 21 




TABLE I (Continued) 


Instructors’ Ratings 


co co 


PR © Oop ο0 μν το Oop 


Course Grades 




show differences between classes. For each
variable the scores reported are on the same
scale, except for instructors’ ratings (the numerical
scale of which depends on the number
of students in the class) and course grades,
which are comparable only for classes in the
same institution.

These means and standard deviations are of
significance in interpreting the correlations with
criteria which are given in Tables VI - X. Restriction
in range for particular classes will
tend to lower the validities generally, but
ought not to obscure the over-all relationships
observed. It is to be expected, however, that
the true validities for the entire group of students
would be higher than those reported for restricted
groups. (In Study II a technique has been
devised for combining groups.)

These data may also be compared with the
means and standard deviations for 294 cases
given in Table III. Such a comparison readily
shows the standing of a particular class in relation
to the larger group. For example, there
are probably no students in Class 08 who reach
the general mean for Section I of the Objective
English Test, while most students in Class 09
would be above the general mean.
2. Reliability of Predictor Variables: There
are a number of indices which are helpful in
judging the reliability of essay material. The
first is the “reader reliability,” the extent to
which readers agree with one another. A random
sample of papers was selected for a re-reading
of the essays and paragraph-revision,
and correlations were run between the first and
second sets of scores. These correlations are
“reader reliabilities” and are reported in Table
II. Reader reliability for the total essay was
computed by summing the scores given by the
first readers to Essays I, II, and III, summing the
corresponding scores given by a second reader,
and correlating the two sets of summation scores.
(Each of the three essays was assigned to a
different reader, so that each summation score
represents three readers.) Papers to
be re-read had the initial score sheets removed
and were distributed to the readers in the usual
way so that no reader knew when he was “re-reading.”
In Table II the reader reliabilities
are seen to be low, and this finding must be
viewed in light of the fact that the true test
reliabilities must necessarily be lower. (The
reader reliability may be thought of as a measure
of scoring accuracy, which for objective
tests is presumably close to perfect.)


Another way in which essay reliability may
be judged is to evaluate the intercorrelations of
several essay questions. This is reasonable
when the essay questions are intended to measure
the same abilities, as was true in the present
study. In Table III, the intercorrelations
of the three essay questions are found to be .41,
.41, and .32. If the Spearman-Brown prophecy
formula is applied, and the assumption is made
that an essay test has three 20-minute questions
each with a correlation of .41 with each of the
other two, then the estimated reliability of the
total test becomes .68. Such an estimate is the
fairest available indicator of the true reliability
of the essays used in this study.

7. Ledyard R. Tucker. “A Note on the Estimation of Test Reliability by the Kuder-Richardson
Formula (20),” Psychometrika, XIV (1949), pp. 117-119.
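
As a worked check on the estimate of .68, the general Spearman-Brown formula can be written out for a test of k comparable parts with a common intercorrelation r. The symbols k, r, and r_kk are notation introduced here for illustration; the value r = .41 for every pair is the assumption stated in the text, which sets aside the third observed intercorrelation of .32.

```latex
r_{kk} = \frac{k\,r}{1 + (k-1)\,r}
       = \frac{3(.41)}{1 + 2(.41)}
       = \frac{1.23}{1.82}
       \approx .68
```

Had the lower value of .32 been used for all three pairs, the same formula would give a smaller figure, which is consistent with the text's later description of .68 as a possibly generous estimate.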

The Paragraph-Revision data indicate that
high reader reliability may be attained with this
type of exercise, although the intercorrelation
of Paragraphs A and B (.33) is within the range
of the intercorrelations found among the essays,
and thus does not promise a higher test reliability.
Table II shows the estimated reliabilities of
the objective English and verbal material. The
Kuder-Richardson formula (21) which was adopted
here typically underestimates the reliability
which would be obtained if formula (20), involving
individual item-difficulties, were used. According
to Tucker 7 the results for formula (21)
may be as much as ten percent less than those
for formula (20). For the purposes of Study I
as a preliminary investigation this briefer method
was regarded as adequate for checking on the
reliability of the objective English and verbal
material.
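
Kuder-Richardson formula (21) needs only the number of items, the mean, and the standard deviation, which is what makes it the briefer method. The sketch below uses the standard textbook form of the formula; the function name is hypothetical, and the illustrative figures are the Class 11 Section I mean and standard deviation as they appear to read in Table I, used here only as sample input and not as a reproduction of the author's computation (which was made over all classes).

```python
def kr21(n_items, mean, sd):
    """Kuder-Richardson formula (21) from summary statistics only.

    r = k/(k-1) * (1 - M(k - M) / (k * s^2)), where k is the number of
    items, M the mean score and s the standard deviation of scores.
    """
    variance = sd ** 2
    return (n_items / (n_items - 1)) * (
        1 - mean * (n_items - mean) / (n_items * variance)
    )


# Assumed sample input: 149 items, mean 103.17, SD 20.56 (one class).
estimate = kr21(149, 103.17, 20.56)
print(round(estimate, 3))
```

Formula (20) would replace the term M(k - M)/k with the sum of the item variances, which requires the difficulty of every item; formula (21) treats all items as equally difficult, and that simplification is the source of the underestimate mentioned in the text.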
The 50-minute objective English section
shows a conservatively estimated reliability of
.93 as compared with the possibly generous estimate
of .68 for 60 minutes of essay questions.
Of the four categories of objective English questions,
the grammar and sentence-structure items
show the highest reliabilities. However, the
“P” and “I” groups are much shorter than the
“G” and “S.” If the “P” and “I” were doubled
in length, their lengths would be greater than
the length of “G” and less than the length of
“S.” The Spearman-Brown prophecy formula,
when applied on the assumption that “P” and
“I” are doubled, yields reliabilities of .73 for
“P” and .53 for “I.” The reliabilities for the
“P” and “I” categories are less than satisfactory,
but they compare favorably with the essay
material in the present study. The reliability
of .85 for the short verbal test indicates that
these items are performing here with the consistency
which other investigators have typically
found in verbal material.
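
The doubling argument can be checked directly from the figures given in the text. The sketch below is illustrative only: the function and variable names are hypothetical, the reliabilities of .57 and .36 are those reported for “P” and “I” in Table II, and the item counts are those stated for the subscores (with the sentence-structure count taken as the 55 Section I items plus the 15 from Section II).

```python
def spearman_brown(r, factor):
    """Projected reliability when a test is lengthened by `factor`."""
    return factor * r / (1 + (factor - 1) * r)


reliability = {"P": 0.57, "I": 0.36}                  # Table II
items = {"P": 27, "I": 33, "G": 47, "S": 55 + 15}     # subscore lengths

doubled_reliability = {
    name: round(spearman_brown(r, 2), 2) for name, r in reliability.items()
}
doubled_length = {name: 2 * items[name] for name in reliability}

print(doubled_reliability)
print(doubled_length)
```

The doubled lengths fall between the lengths of “G” and “S,” so the projected reliabilities can be compared with the .82 and .81 reported for those two subscores on roughly equal footing.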

3. Relationships Among Predictor Variables:
The intercorrelations in Table III are of value
in shedding some light on the possible components
of the abilities tested by each of the predictor
measures. For this purpose, lack of

fo 


on of Test Reliability by the Kuder-Richardson 
117-119. 


TABLE II


RELIABILITY INFORMATION ON PREDICTOR
VARIABLES


Estimated Reliabilities of Objective Measures

                              Estimated
Variable                      Reliability*
Objective English Test
   Section I                     .93
   “P” Score                     .57
   “I” Score                     .36
   “G” Score                     .82
   “S” Score                     .81
Verbal Test                      .85


Computed Reader Reliabilities for Essays and
Paragraph-Revision

                  Reader         No. Papers
Variable          Reliability    in Sample
Essay Total          .78            122
Essay I              .67            129
Essay II             .47            138
Essay III            .67            136
Paragraph A          .83             89
Paragraph B          .59


*These estimated reliabilities were computed
by Kuder-Richardson formula (21):




pe ME NR eee RR os ELE 
8£'9 916 (44. 18'L 


18 ατα μι ας 906 IT ος 00'072 uonerAeq 


prepujg 
n HS ο καν μα Εις REA. In EE e oco eg 
TILT 0566 τοτε  — I89'0P 06701 


τε 
££ 


£6 


9v'Pl θε ταν 62h GIS ZTO% 565: — 99'POT wee 
Le 96 e Le τα 39 19 τα LY 29 1591, Τ60ΙΘΛ (ET) 
19 6ζ [44 16 τε τε ας 8z ος ας g udei3ereg (51) 
£6 LZ ez ££ 96 ας νε τε LZ 95 v udersered (TI) 
56 LZ ας Iv OF 12 96 £€ [4 TAOL UOI 
-Slsoy-ydersered (01) 
55 σε ΤΡ vL Sb ep OF 66 n m Aessa (6) 
LZ σε I LL 9v [17 τν 66 9v 1 fessa (8 ) 
ας ΤΡ Hi 8L ey 6£ ας se [7 1Aessu (a ) 
T% vL LL 8L 96 98 08 6v 09 TAOL fessa (9 ) 
of Sy 9v ey 8s v8 £L OL v6 91056 «S» (8) 
IP (32 8v 66 96 y8 6L 0L v6 exoog 49» (v ) 
9€ ov τν ας og EL 6L £9 v8 eao)g «1»; (6) 
££ ας 66 86 [7 OL 0L £9 18 euoog «d» (Z ) 
e Ly 85 £v 09 v6 v6 v8 18 luonoeg 15931, 


usr2ug eAnoelao (τ ) 


(66 = N) 
SASSV1IO TIV UAAO SX'IHVIHVA HOLOIGIUd AO SNOLLV'ISHHOOHS.LNI 


ΤΙ 4'Id VL 




relationship is as significant as its presence.
Paragraph-Revision has fairly
low and almost uniform correlations with all
other variables; it is apparently influenced by
factors somewhat unrelated to abilities tested by
the other measures. Furthermore, the correlations
between Paragraphs A and B are lower
than the correlation of each with certain other
variables.


The correlation between Verbal and 
S .67 when like wise 


Verbal. 

The «p, 2} SL 1 ας, 22 
higher relationships with 
other variables, The fac 
scores Correlates more hi 
than with Verbal indicates that the Objective English
Test is measuring some trait other than
verbal ability.


4, Reliability of Instructors: 


d-com- 
The instr uctors 


obtained among this highly trai 
tive group of instructors. The 


TXX VI'yy 


9, Charles 0. Peters and Walter R. 


(Vol. XXII 


consistency of the two methods with these raters
led to the conclusion that simple rank-order ratings
might be used in Study II where somewhat less
interviewing time would be available.

5. Relationship Between Criterion Variables:
In Table V the correlations between instructors’
ratings and course grades are seen to range
from .44 to .91, with a median r of .85. Despite
the attempt to “purify” the ratings it is
not surprising that the correlations are this high
since the course grades are all in composition.
The correlations are low enough, however,
to reward the attempt to separate the two
types of criterion. It is noteworthy that the
correlations in Table V do not tend to be associated
with the lower reliabilities reported
in Table IV.

6. Validities of Predictor Variables:
The correlations, by class, of the predictor variables
with instructors’ ratings and with course
grades are presented in Tables VI-X inclusive.

In view of the large between-class differences
in ability,
timates fr 


the worthwhile 
tion. 


ur principal predictors appen, 
In predicting instructo rb- 

English Test and the V€77. 

best with median r’s οἵ α΄ 


is close to Verbal with es 
agraph-Revision remai 
τ Gt " 28. rs’ 
With respect to the criterion of instructo 


e” 
ratings, the effectiveness of the individual P of 
dictors may be furth 


Superior. While the 
jective English Test -active 
Test (. 43 as compared with .94), the Objec 
English Test neverthe] 


9. Charles C. Peters and Walter R. Van Voorhis. Statistical Procedures and Their Mathematical
Bases (New York: McGraw-Hill, 1940), p. 157.




59569 Jo requinN = (N)x 


G8 ' I 
ΠΌΤΡΘΙΝ 
GC 88° 9T vc pY ` 80 
Ῥό 98° GT £6 G8 ᾿ L0 
£6 88° PI 66 68 ` 90 
0€ I6" eI τσ 08 7 G0 
L6 98 ` GI LA l2: pO 
66 68" II 56 G9" 50 
Τό 88 ' 01 96 Sh ` σ0 
£6 69° 60 ασ σ8 ` 10 
(N) J ‘ON (N) 1 ‘ON 
SStID SStIO 


(SSV'IO Ad) SHAVUD ASUNOD ANV 
SONLLVH .SHOLONULSNI NAAMLAJ SNOLLV'I3HH02 


A N'IdV.L 


59569 10 1equinN = (N)« 
MR 


jët J 
ΠΈΤΡΘΙΝ 
ο... We 
LZ [ΗΝ 9T 9c G6 ` 80 
IK 96 GT GZ 66 LO 
GZ 86 ' val ζζ 16" 90 
0€ G6 ' eT LC 86' GO 
8% 66° ZI vt 66' v0 
16 16 IT ος 66᾽ 90 
ασ 86 ' OT LZ 98 ' ζ0 
9c 86 ` 60 9c L8’ 10 
(N) 1 'ON «(N) 1 ὋΝ 
55ΈΩ Sse 


a a uw. I.É aaa LL 


(SSVT χα) GCOHLAW SNOSIHVdWOO-G3UHIVd AG 
CANIVLEO SDNILVH ανν GOHLSIN YACHO-ANVY 
χα GANIVLEO SONLLVH NAAMLAG SNOLLV'IdHHOO 


AI d'IG VL 




TABLE VI 


CORRELATIONS OF PRINCIPAL PREDICTORS WITH INSTRUCTORS’ RATINGS (BY CLASS)


Objective English Essay 
Section I Questions 


26 52 27 

03 66 33 21 32 21 33 .68 33 

04 .67 24 .15 23 .18 23 .26 23 
05 -.20 24 -. 06 25 -.11 24 . 09 25 
06 .54 22 .34 19 .57 20 .54 21 
07 . 43 23 .53 21 .25 23 . 67 24 
08 .48 24 .15 24 27 24 -.13 26 
09 .42 24 .63 24 .35 23 .45 24 
10 κών " .40 24 .35 24 ώς, » 
11 . 09 29 .02 25 -.07 25 . 05 26 
12 .23 26 .35 26 «41 26 .36 26 
13 .28 30 .24 29 .30 29 .43 29 
14 .34 22 .43 22 .55 22 .18 22 
15 .28 23 «29 23 .55 23 .43 23 
16 .61 24 .29 24 .18 23 .36 23 
το «48 «84 .9ῃ .43 


*(N) = Number of cases 
** Excluding Class 10 




TABLE VII 


CORRELATIONS OF PRINCIPAL PREDICTORS WITH COURSE GRADES (BY CLASS) 


Objective English Essay Paragraph- Verbal 
01 .53 25 .60 25 .15 25 .53 25 
02 . 60 26 57 26 34 26 .47 26 
03 .10 33 .16 32 23 33 64 33 
04 .61 24 .38 23 .13 23 33 23 
05 -.13 23 -.29 24 -.24 24 -.17 24 
06 55 22 .24 19 .58 20 .53 21 
07 JT 22 .63 20 -.02 23 .51 23 
08 «34 98 . 46 23 .33 24 -.11 25 
09 .28 23 57 23 -. 06 23 59 23 
10 «es B 33 24 .32 24 ic E 
11 .19 29 .16 25 -.09 25 .22 26 
12 .15 26 .45 26 35 26 .38 26 
13 .26 30 .36 29 .37 29 .39 29 
14 «41 22 .49 22 .65 22 .02 22 
15 .30 23 .31 23 .58 23 28 23 
16 37 23 56 23 . 03 23 .28 22 
Median 
46 23 38 


*(N) = Number of cases
** Excluding Class 10


TABLE VIII


CORRELATIONS OF OBJECTIVE ENGLISH TEST SUBSCORES WITH INSTRUCTORS’
RATINGS AND COURSE GRADES (BY CLASS)


Correlations with


Correlations with 


S 


. 33 


Eoo Instructors: Ratings Course Grades 
No p! I G?  g! (N | p I G 
01 . 38 11 . 48 . 45 25 (9T o Νί . 62 
02 . 62 16 «90 . 46 27 +53 . 26 «92 
03 . 42 59 s53 . 70 33 «Ὁ . 49 «Βα 
04 . 47 08 . 03 «35 24 . 36 47 . 48 
05 7.91 - 9η -.08 -08 24 7.15  -.14 . 03 
06 "o. 28 oe by! 22 . 45 . 16 . 98 
07 . 18 49 . 49 . 39 23 -. 06 . 32 .24 
08 . 36 04 . 63 . 26 24 .41 -09 . 30 
09 . 48 ο 24 .99 . 08 
11 κ Is 03 T s21 29 . 04 . 09 418 
12 .21 11 . 18 . 30 26 «20.  -.99 . 07 
13 . 19 32 . 19 . 32 30 Le . 38 EG 
14 «13 62 . 40 15 22 . 20 ol . 48 
15 96 - 18 . 29 ε d 23 «249. -.01 . 28 
16 20 47 οι . 99 24 . 14 . 29 . 35 

Median 

r? .29 - 22 . 49 . 36 . 25 .41 E. 

P = Punctuation; I = Idiom; G = Grammar; S = Sentence Structure

(N) = Number of cases

Excluding Classes 09 and 10


187 


5986Ω Jo 1aquinN = (N) x 
III £essH = III e 
II ess = II e 


HUDDLESTON 


I AeSSM =I; 
ce" vo rA 9e" ht * vo J 
ΠΈΤΡΘΙΝ 
ες 89’ ες 87" ες jz vc 9€ YO ΙΙ) vC — LO’ 9T 
eec IT" ες ο] eo — Te ες  *0"- εἷς ος ες 60 GI 
ec p σσ 90" zg 08" ει g ze ατ' eo τ᾽ pI 
62 62° 62 στ 62 αγ 6z og 6z 00° 62 og" el 
ος gp’ og OI ος LZ’ og t^ ος 0’ ος σσ eT 
ez;  90'- G2 60° ασ τε σσ 6t- GZ 60° ας 60^ ΤΙ 
7 pe’ ντ | LS vc το νο ος pZ «ος bZ co- ο 
ες ος ες γε ες ο τσ ϐς v? he τσ p 60 
ez GO- τ 60^ PZ «ες ος 02° ασ οἱ) ες ϐαοϱ- 80 
σσ Tt ες GP’ I? στ ες €^ τσ ϐς τ ος LO 
ος m“ ος 80^ 6t ο ος στα ος SI^ 6I 81᾽ 90 
PZ οσοι - τε 22° 2 εν -ἰ ᾱ ο ce $i ασ b- ο 
ες gg“ ες οὐ ες er' ος Τ᾽ ες vl ες το- το 
ες gg“ c£ | 69^ ££ ο) ες Lv Ze 65᾽ ee ο ε0 
ος 8p’ ez  TI' ος 8gp’ ht — Ie ος T LZ 96° Z0 
GZ τ᾽ ας 06] ες τε oez O09 ασ Lv ec Sb 10 
(N) i (N) Il (N) I (N) m N) ντ *(N) 1 'ON 
S9peEIr) Θ851ΠΟ09 SSUT]eY ,S10jONISUT eu 
UjIA SUOT]*[9.110;) 


ΠΙΑ SUOT}ETIIIOD 


March, 1954) 


(εσνπο Ad) SHAVUD ASUNOD ANV 
SONLLVH .SHOLONULSNI HLIM SNOLLS400 AVSSA πνραΙιΔΙαΝΙ JO SNOLLV'I4HH02 


XI 4'IGV.L 




TABLE X 


CORRELATIONS OF INDIVIDUAL PARAGRAPH-REVISION QUESTIONS 
WITH INSTRUCTORS’ RATINGS AND COURSE GRADES (BY CLASS) 


Correlations with 


Instructors’ 
Course Grades 
Class 

No. (N) B (N) 
01 27 25 37 23 13 25 21 23 
02 26 27 sd 24 32 26 14 23 
03 06 33 .39 31 18 33 27 31 
04 16 23 «12 23 21 23 1) 23 
05 = 14 25 -. 03 22 =B 24 -. 28 21 
06 44 20 .48 7 59 20 .32 7 
07 10 24 «84 24 -. 19 28 30 23 
08 04 26 .27 16 17 25 45 14 
09 23 23 .44 24 -.07 22 11 23 
10 40 24 .14 24 38 24 10 24 
11 -. 09 25 .28 21 -.04 25 . 23 21 
12 58 26 -. 05 24 58 26 "02 24 
13 37 29 . 00 28 47 29 ~. 06 28 
14 50 22 - 43 22 61 22 .51 22 
15 51 23 ΠΠ 22 52 28 . 62 22 
16 24 24 «35 31 


A = Paragraph A
B = Paragraph B
(N) = Number of cases

Excluding Classes 06 and 08




However, the Verbal Test, also with a median validity
of .43, showed itself superior to the Essay
Test in 10 of the 15 classes and superior to the
Objective Test in 8 of the 15 classes (equal in
one class). Thus there is a strong indication
that the Verbal Test may be the most dependable
predictor of instructors’ ratings.
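
The class-by-class comparison of two predictors amounts to a simple tally of which validity is higher in each class. The sketch below applies that tally to the Objective English and Verbal columns of Table VI, using only the thirteen rows that are legible in this copy (Classes 01 and 02 cannot be read, and Class 10 has no objective or verbal scores), so its counts are partial and are not meant to reproduce the figures for all 15 classes. The function and variable names are hypothetical.

```python
# (Objective English r, Verbal r) with instructors' ratings, by class,
# for the rows of Table VI that are legible in this copy.
validities = {
    "03": (0.66, 0.68), "04": (0.67, 0.26), "05": (-0.20, 0.09),
    "06": (0.54, 0.54), "07": (0.43, 0.67), "08": (0.48, -0.13),
    "09": (0.42, 0.45), "11": (0.09, 0.05), "12": (0.23, 0.36),
    "13": (0.28, 0.43), "14": (0.34, 0.18), "15": (0.28, 0.43),
    "16": (0.61, 0.36),
}


def tally(pairs):
    """Count classes where the first value is higher, lower, or equal."""
    pairs = list(pairs)
    first_higher = sum(1 for a, b in pairs if a > b)
    second_higher = sum(1 for a, b in pairs if a < b)
    ties = sum(1 for a, b in pairs if a == b)
    return first_higher, second_higher, ties


objective_higher, verbal_higher, equal = tally(validities.values())
print(objective_higher, verbal_higher, equal)
```

A tally of this kind ignores the size of each difference and the N behind each coefficient, which is one reason the comparison is described only as an indication.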
On the other hand, with respect to course
grades, the Verbal Test drops to third place in
predictive value. The Essay Test surpasses
both the Objective English Test and the Verbal
Test in 10 of the 15 classes, while the Objective
English Test surpasses the Verbal Test in 9 of
the 15 classes (equal in one class). Thus the
Essay Test appears to be the most dependable
predictor of course grades although it is inferior
in predicting instructors’ ratings. This discrepancy
in the validities of the Essay Test may
be an indication that it has a language-ability
component in addition to the verbal component.
It is possible that in this study many students
who received the benefit of a high rating on the
basis of their best work were unable to maintain
the same rank in class on the basis of their
over-all performance. In this case, the Verbal
Test would appear best in estimating the highest
capabilities of each student, while the Essay Test
is more closely related to the students’ average
performance in the composition course.
It should be noted that freshmen are typically required
to write on a number of subjects uncongenial
to them, whereas more advanced students
have greater opportunity for personal choice.
The two major criteria, with different underlying
assumptions, may both be desirable depending
upon the situation in which they are used.

The relative values of the subscores for each
predictor variable are also of interest. The
data in Table VIII show that for both criteria the
grammar and sentence-structure items were the
most valid items in the Objective English Test.
(In fact, grammar alone did as well as the total
test, and sentence-structure did almost as well.)


ft 
the hi he three essay questions, question III had 
h criteria, and 


predicting course 


Choa t 
las. 4 08 because so few students 1 


(d) Conclusions


1. The reliabilities of the Objective English
Test (.93 for 149 items) and the Verbal Test (.85 for 30 items)
are sufficiently high for satisfactory measurement
of college freshmen. In actual practice
the Verbal Test would be longer and therefore more
reliable.

2. The reader reliability for the Essay Total
(.78 for three essays combined) is too low for
satisfactory measurement. The test reliability
for Essay Total was estimated at .68.

3. The Paragraph-Revision Test showed a
high reader reliability for Paragraph A (.83) but
a lower one for Paragraph B (.59). The data
indicate that there is a potentiality, at least, for
satisfactory reader reliability in Paragraph-Revision.

4. The relationship of Paragraph-Revision
with all other variables is low and indeterminate.

5. The Objective English Test, the Verbal
Test, and the Essay Test are measuring the verbal
factor chiefly. In addition, the Objective English
Test and the Essay Test may have another
element in common, presumably achievement
in the handling of language.

6. Rank-order ratings by instructors show 
sufficiently high reliability (median r of .9 with 
paired-comparisons) to justify using them in 
Study II. 

7. Instructors’ ratings and course grades are 
highly correlated (median r of . 85). 

8. Except for the fact that Paragraph-Revision
shows consistently the lowest validities, no
definite conclusions can be made regarding the
relative validities of the principal predictors.
The Verbal Test showed the closest relationship
to instructors’ ratings while the Essay Test was
the best predictor of freshman English grades.

9. The grammar and sentence-structure items
in the Objective English Test showed higher
reliabilities and validities than did the punctuation
and idiomatic-expression items.

10. Essay III appeared to be the most valid
of the essay questions, and was one of the two
most reliable of the essay questions.

11. A continuation of the investigation is needed,
in which a larger group of students may be
used and in which criterion data may be made
comparable over all classes.


E. Study II
(a) The Test Population


Forty-four groups of secondary-school seniors
were identified who took the College Entrance
Examination Board’s April examination in English
Composition and whose teachers were in Princeton
as members of the corps of readers for that
examination. All possible groups of at least 13
students, whose teachers were in Princeton,
were utilized. The groups ranged in size from
13 to 26 students and represented 30 schools (one
teacher from each school); there were 21 private
schools with 33 groups of students and 9 public
schools with 11 groups. The selective factors
operating in favor of the private schools were:
(1) certain private schools tend to train more
C. E. E. B. candidates per school than do public
schools; (2) private schools are more interested
than are public schools in releasing their teachers
for duty in Princeton as readers.


(b) Description of Variables 


A one-hour examination in English Composition
was administered which consisted of three
parts:


6 punctuation items
7 idiomatic-expression items
16 grammar items
16 sentence-structure items


Ina Single paragraph discuss a Serious 
error which you think parents 
in the rearing of a child, andi 


Support your Opinion, 
discussion can be pres 
graph of about 150 words, ) 


ing, punctuation, Syntax 
tence structure), 


(3) Paragraph-Revision: The same revision material from Study I was reproduced here without change.

Many of the students who took the English Composition Test also took the Scholastic Aptitude Test, thus providing as a fourth predictor:


(4) Score on Verbal Sections of the Scholastic Aptitude Test: The verbal material occupied 100 minutes of testing time, with four separately timed parts as follows:

Part 1: 30 minutes; 65 items
Three item-types:
Analogies, 25 items
Antonyms, 20 items
Sentence completion, 20 items
Items are arranged in groups of 5, and the types are rotated (e.g., 5 analogies, 5 antonyms, 5 sentence completions, etc.)

Part 2: 30 minutes; 30 items
Reading-comprehension material consisting of 5 paragraphs each followed by questions based on its content.

Part 3: 25 minutes; 65 items
Two item-types:
Antonyms, 40 items
Analogies, 25 items
Groups of 8 antonyms are alternated with groups of 5 analogies.

Part 4: 15 minutes; 10 items
Two reading-comprehension paragraphs each followed by questions based on its content.


[...] mean was 495 and [...]. For convenience in working with the statistics for the experimental group the standard scores were divided by 10.

The fifth and sixth major variables were criterion variables, as described below:


(1) Instructors' Ratings: [...] their students' writing ability and it was suggested that the [...] familiarize themselves thoroughly with the work of those students to be rated. The teachers had no advance knowledge, [...] of the manner in which the rating was to be done. Most of the teachers were already [...] with their students' [...] so that little review was required; on the whole it appeared that these teachers knew their students better than the college instructors in Study I had known theirs. During the first two [...] of the reading period, the teachers were [...] in groups of two or three to a small quiet [...] in which the rating procedure could be discussed at leisure. Each group discussion progressed to the point where each teacher indicated an [...] of the principles evolved in Study I. Then each teacher was given a set of [...] for each group of his students and was [...] the cards in rank order. The teachers had unlimited time and a quiet place for their work. Upon completion of the ratings the teachers were thanked and were left with the [...] that their participation in the study was finished. At the end of the week they were asked to do the ratings again in order to insure the maximum degree of accuracy; it was explained to them that their ratings on the two occasions would be averaged, but that they should have no [...] whatever for aiming at consistency. Since there was no reward associated with consistency, and since the teachers' attitude was entirely that of desiring to be helpful in a study they considered significant, it is the writer's belief that the raters achieved a high [...] of carefulness and conscientiousness. The two ratings for each student were then averaged to determine the rating to be used in the study. Reliability was estimated by correlating the first sets of ratings with the second; however, it is to be expected that the combination of both ratings would be more accurate than either set of ratings alone. All ratings were finally expressed in terms of percentile ranks within groups.

10. For sample items see Appendix I.

11. For illustrative items see Appendix II.
(2) English Course Grades: The final semester grades in English for [...] were averaged for each student who had been in attendance at the same school for that length of time. Most of the students were in this category. For students who had transferred from other schools, only the grades obtained at the school currently attended were used. The course grades were typical in that they represented the students' success in all the types of performance evaluated in secondary school English courses and not composition alone. It was felt that averaging the four grades would minimize the effect of unreliability in grading, and that omitting grades from other schools would rule out the effects of differences between schools. It is recognized, of course, that teachers in the same school differ in their marking standards, and no claim is made for this criterion as a highly reliable measure. However, since composition is an important part of English course work, and since success in a composition test is often used for guidance and placement, it is of interest to observe the extent to which the predictor variables and the ratings are related to course achievement.


(c) Results


(1) Reliability of Predictor Variables: In this study the reader reliabilities were computed by a method which yielded results based upon a large number of readings of each paper, rather than on two per paper as in Study I. In order that each reader assigned to a question might have an individual copy of each paper in the sample, the papers were reproduced photographically.13 The papers were selected with the intention of including answers which varied in quality, a selective factor which would be expected to produce a higher reliability than if the papers had been chosen to be representative of the entire group of candidates.

12. As a secondary [...], Professor [...]'s performance of a role similar to that he played in Study I was of great value in establishing continuity.

13. An unpublished study previously conducted by the College Entrance Examination Board had shown that insofar as means, standard deviations and intercorrelations are concerned, the original blue books and the copies are equivalent.

For each student's paper, the approximate "true score" was defined as the average score assigned it by all readers who read that paper. Thus, an individual reader was "reliable" to the extent that his scores corresponded with the "true scores." Complete data were available for 39 readers of the essay question and for 30 readers (two groups of 15 each) of the paragraph-revision section. The "true scores," then, were based on 39 observations in the case of the essay and 15 observations in each of the analyses of the paragraph-revision.

The formula for the reader reliabilities was based on the assumption that the obtaining of "true scores" makes it possible to relate these scores to the formula for the index of reliability. Thus, for an individual reader, the correlation between the scores he gives and the true scores is defined as the square root of his reliability ($\sqrt{r_{xx}}$). The reader's reliability, then, would equal $r_{xt}^2$. The formula was developed as follows:14


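\[
r_{xt} = \frac{\sum xt}{N \sigma_x \sigma_t}
\qquad \left(\sigma_x^2 = \frac{\sum x^2}{N}, \text{ etc.}\right)
\]

\[
r_{xt} = \frac{\sum xt}{\sqrt{\left(\sum x^2\right)\left(\sum t^2\right)}}
\]

\[
r_{xt}^2 = \frac{\left(\sum xt\right)^2}{\left(\sum x^2\right)\left(\sum t^2\right)}
\qquad \text{(reliability for an individual reader)}
\]

\[
\frac{\left(\sum\sum xt\right)^2}{\left(\sum\sum x^2\right)\left(n \sum t^2\right)}
\qquad \text{(over-all reliability for } n \text{ readers)}
\]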
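The reader-reliability computation described here can be sketched in Python. This is only an illustration under stated assumptions, not the author's procedure: it assumes a complete readers-by-papers score matrix in which every reader read every paper, it uses NumPy, and the function name `reader_reliabilities` and the sample scores are hypothetical.

```python
import numpy as np


def reader_reliabilities(scores):
    """Return (individual reliabilities, over-all reliability).

    scores: 2-D array-like, one row per reader, one column per paper.
    The approximate "true score" of a paper is its mean over all readers.
    """
    scores = np.asarray(scores, dtype=float)
    n_readers = scores.shape[0]

    true = scores.mean(axis=0)                        # "true score" per paper
    t = true - true.mean()                            # true scores as deviations
    x = scores - scores.mean(axis=1, keepdims=True)   # each reader's deviations

    sum_xt = (x * t).sum(axis=1)                      # one value per reader
    sum_x2 = (x ** 2).sum(axis=1)
    sum_t2 = (t ** 2).sum()

    # Squared correlation between a reader's scores and the true scores.
    individual = sum_xt ** 2 / (sum_x2 * sum_t2)
    # Pooled version over all n readers.
    overall = sum_xt.sum() ** 2 / (sum_x2.sum() * n_readers * sum_t2)
    return individual, overall


# Hypothetical scores: 3 readers, 4 papers.
sample_scores = [[1, 2, 3, 4], [2, 2, 4, 4], [1, 3, 3, 5]]
individual, overall = reader_reliabilities(sample_scores)
print(individual, overall)
```

Each individual value is the squared product-moment correlation between one reader's scores and the averaged scores, so it can be checked against any standard correlation routine.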


The reliabilities for the Objective English section and for the Verbal Score were computed by correlating scores on the odd-numbered items with scores on the even-numbered items and correcting by the Spearman-Brown prophecy formula. For the Objective English section the reliability is not affected by speededness since [...]

All the reliabilities are reported in Table XI. These reliabilities [...]

The highest reliability, .96, was attained by the Verbal Test. The Objective English section, which occupied only about 20 minutes of testing time, attained a reliability of .78. If the entire 60 minutes of English Composition had been devoted to objective questions the reliability, as estimated by the Spearman-Brown prophecy formula, would have risen to .91.
The Paragraph-Revision demonstrated high reader reliabilities, ranging from .69 to .84. Paragraph A was the better of the two paragraphs, showing for two groups of readers reliabilities of .81 and .84; this is approximately equal to the reader reliability of .83 obtained for Paragraph A in Study I. The reader reliability of Paragraph B rose from .59 in Study I to .69 and .72 in Study II. It may be concluded that a paragraph-revision test may achieve a highly satisfactory reader reliability, particularly if the entire hour were devoted to it. However, the correlation between Paragraph A and Paragraph B was .37, only slightly higher than that obtained in Study I. If this correlation is taken as an estimate of test reliability, then application of the Spearman-Brown prophecy formula would yield an estimated test reliability of .54; if the test were extended to three times its length, the estimated reliability would become .78. The nature of the material itself led the English examiners to believe that it would be possible to develop questions which would show higher intercorrelations, thus yielding higher estimated reliabilities.
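The Spearman-Brown estimates quoted in this section can be reproduced with a few lines of Python. The sketch below uses the standard form of the prophecy formula for a test lengthened k times; the function name `spearman_brown` is a hypothetical label, and the inputs are the coefficients reported in the text.

```python
def spearman_brown(r, k):
    """Estimated reliability of a test lengthened k times,
    given the reliability r of the test at its present length."""
    return k * r / (1 + (k - 1) * r)


# Objective English: about 20 minutes extended to 60 minutes (k = 3).
print(round(spearman_brown(0.78, 3), 2))
# Paragraph-Revision: correlation of .37 between Paragraphs A and B,
# stepped up to the two-paragraph test (k = 2).
print(round(spearman_brown(0.37, 2), 2))
# The same test extended to three times its length (k = 3).
print(round(spearman_brown(0.54, 3), 2))
```

Under these assumptions the three calls give .91, .54 and .78, which agree with the figures reported in the text.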
The reader reliability for the essay question is a little lower than that obtained in Study I. [...] for a twenty-minute question and .68 for a test containing three such questions remain reasonable.


(2) [...] Correlations between the first and second sets of ratings by each instructor were computed to give an estimate of the reliability of the ratings. The Spearman ranks [...]

\[
\rho = 1 - \frac{6 \sum D^2}{N(N^2 - 1)}
\]

The results are presented in Table XII, [...] a range of .86 to [...]. [...] memory of his previous ratings must have been operative to some extent, [...] was made to [...] regards the obtained ratings as highly reliable in that they represent considered and consistent estimates of the students' ability.


TABLE XI

RELIABILITY INFORMATION ON PREDICTOR VARIABLES

Reliabilities of Objective Measures (odd-even correlations corrected by Spearman-Brown prophecy formula)

Variable             Reliability    No. Papers in Sample
Objective English    .78            500
S. A. T. Verbal      .96            500

Reader Reliabilities for Essay and Paragraph-Revision

Variable         Reader Reliability    No. of Readers    No. Papers in Sample
Essay            .62                   39                38
Paragraph A*     .84                   15                39
Paragraph A*     .81                   15                40
Paragraph B*     .69                   15                38
Paragraph B*     .72                   15                38

*Two groups of readers were used for each of the paragraphs in the Paragraph-Revision section.


TABLE XII

CORRELATIONS BETWEEN FIRST AND SECOND RATINGS BY INSTRUCTORS IN THE FORTY-FOUR CLASSES
(Computed by the Spearman ranks formula)

[Table entries (Reliability, No. of Classes) not recoverable. Median ρ = .9[...]]
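As an illustration of the Spearman ranks formula used for the rating reliabilities, the following Python sketch correlates two rank orderings of the same students. It assumes complete rankings without ties; the function name `spearman_rho` and the example ranks are hypothetical.

```python
def spearman_rho(first_ranks, second_ranks):
    """Spearman rank correlation: 1 - 6*sum(D^2) / (N*(N^2 - 1)).

    first_ranks, second_ranks: the ranks given to the same N students
    on two occasions, in the same student order, with no tied ranks.
    """
    n = len(first_ranks)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(first_ranks, second_ranks))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


# Hypothetical class of five students; the top two change places
# between the first and the second rating.
print(spearman_rho([1, 2, 3, 4, 5], [2, 1, 3, 4, 5]))
```

With tied ranks this shortcut is no longer exact, and a product-moment correlation of the ranks would be the safer choice.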
(3) Combination of Data: The key problem in a study which attempts to use criterion data from different sources is to combine the criterion measures in such a way that they may be regarded as being on a common scale. In the present investigation it is obvious that course grades do not have the same meaning from class to class and that equivalent percentile ranks (based on teachers' rank-order ratings) likewise are [...] classes.

To solve this problem [...] adjustment was utilized.15 [...] develop multiple-regression equations [...] determine a corrected criterion [...] students in each class. These corrected criterion [...] (Tables XII, XVI [...]) [...] measures have [...] than when pooled [...] adding corrections to the original criterion scores, there being one additive correction for all students in each class.

If a single predictor were involved, the corrected mean criterion score for class i (Corr. $\bar{Y}_i$) could be expressed as follows:

\[
\text{Corr. } \bar{Y}_i = \bar{Y} + b_{yx}\,(\bar{X}_i - \bar{X})
\]

($\bar{Y}$ is the mean criterion score for all classes; $b_{yx}$ is the raw-score regression of Y on X; $\bar{X}_i$ is the mean predictor score for class i; and $\bar{X}$ is the mean predictor score for all classes.)

The adjustment constant to be added to each score in class i would then be the difference between the obtained mean for the class and the corrected mean:16

\[
A_i = (\text{Corr. } \bar{Y}_i) - \bar{Y}_i = \bar{Y} - \bar{Y}_i + b_{yx}\,(\bar{X}_i - \bar{X})
\]

The adjustment constant based on the raw-score regression weights computed in Analysis II for all predictor variables (j) becomes:

\[
A_i = \bar{Y} - \bar{Y}_i + \Big(\sum_j \bar{X}_{ij}\, b_{yj}\Big) - \Big(\sum_j \bar{X}_j\, b_{yj}\Big)
\]

The adjustment constants thus computed for each class were added to each criterion score; both ratings and course grades were adjusted by this method. A new table of intercorrelations was then constructed using the new adjusted scores for the criterion variables. Multiple correlations and regression weights were again computed. (Tables XV, XVIII, XIX, XX)

As a result of [...] would tend to be [...] increase in [...] relations among the predictor variables are [...] affected by this adjustment of [...] The adjusted validities [...] estimates with [...]

16. A basic formula of [...] S. S. Wilks, Elementary Statistical Analysis (Princeton, New Jersey: Princeton University Press, 1948), p. [...]
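The class-adjustment step can be sketched in Python as follows. This is a minimal illustration, not the study's actual computation: it assumes the raw-score regression weights are already available (here passed in as the hypothetical argument `b`), it uses NumPy, and the function name `adjustment_constants` and all data are invented for the example.

```python
import numpy as np


def adjustment_constants(X, y, class_ids, b):
    """Additive correction A_i for each class i:

        A_i = Ybar - Ybar_i + sum_j b_j * (Xbar_ij - Xbar_j)

    X: predictor scores, shape (students, predictors)
    y: criterion scores, one per student
    class_ids: class label for each student
    b: raw-score regression weights, one per predictor (assumed given)
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    class_ids = np.asarray(class_ids)
    b = np.asarray(b, dtype=float)

    grand_y = y.mean()            # mean criterion score, all classes
    grand_x = X.mean(axis=0)      # mean predictor scores, all classes

    constants = {}
    for c in np.unique(class_ids):
        in_class = class_ids == c
        class_y = y[in_class].mean()
        class_x = X[in_class].mean(axis=0)
        constants[c.item()] = float(grand_y - class_y + (class_x - grand_x) @ b)
    return constants


# Hypothetical data: two classes given identical grades although
# class "B" scores higher on the single predictor.
constants = adjustment_constants(
    X=[[10], [12], [14], [16]],
    y=[50, 54, 50, 54],
    class_ids=["A", "A", "B", "B"],
    b=[0.5],
)
print(constants)
```

In this invented case the abler class receives a positive constant and the weaker class a negative one, which is the intended effect of placing the criterion scores of different classes on a common scale.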


TABLE XIII

INTERCORRELATIONS AMONG ALL VARIABLES: ANALYSIS I WITH UNADJUSTED CRITERION SCORES
(N = 420)

[Table entries not recoverable. Variables: (1) Objective English; (2) Essay, Content; (3) Essay, Style; (4) Paragraph A; (5) Paragraph B; (6) Instructors' Ratings; (7) Average English Grade; (8) Essay Total (2, 3); (9) Paragraph-Revision Total (4, 5); (10) Total English (1, 2, 3, 4, 5); (11) Verbal Test; with means and standard deviations.]


TABLE XIV

INTERCORRELATIONS AMONG ALL VARIABLES: ANALYSIS II WITH WITHIN-CLASS DEVIATION SCORES
(N = 420)

[Table entries not recoverable. Variables (1) to (11) as in Table XIII.]


TABLE XV

INTERCORRELATIONS AMONG ALL VARIABLES: ANALYSIS III WITH ADJUSTED CRITERION SCORES
(N = 420)

[Table entries not recoverable. Variables (1) to (11) as in Table XIII.]


TABLE XVI

INTERCORRELATIONS AMONG VARIABLES EXCLUDING VERBAL TEST: ANALYSIS I WITH UNADJUSTED CRITERION SCORES
(N = 763)

[Table entries not recoverable. Variables (1) to (10) as in Table XIII.]


(4) Interrelationships Among Variables: In Tables XIII-XVIII are presented the intercorrelations among all the variables studied. Tables XIII-XV represent the 420 cases who took the Verbal test as well as the English test. The entire group of 763 cases are represented in Tables XVI-XVIII, which do not include the Verbal test. For an analysis of the results of the study only Tables XIII-XV are needed; the additional tables were computed to determine whether any significant information was being lost by the enforced dropping of the 343 students who did not take the Verbal test. Comparison of the two sets of tables will indicate that the intercorrelations for the two groups are approximately the same.

The means and standard deviations reported in [...] reflect the fact that the three parts of the English test were weighted in computing the total English score. The Objective English and the Essay each received a weight of 3 [...] the Paragraph-Revision was [...]. This weighting was done early in the study on the basis of an observed range of scores on a sample of 100 papers, and was intended to equalize the contribution of the three parts in the reported scores. The full data indicate that this system [...] weight the Objective English [...] too heavily; however, this weighting [...] affect any of the intercorrelations except those involving the total English score.

For the following discussion of the interrelationships among variables, Table XV will be [...] since it includes all variables and represents the analysis for adjusted criterion scores.

Paragraph-Revision continues to show the lowest [...] with other variables, although the correlation between Paragraphs A and B [...] from .33 in Study I to .37 in Study II. [...] unlike the finding in Study I, Paragraphs A and B correlate higher with each other than with any other variable.

Objective English continues to show a higher correlation with Verbal (.61) than with Essay [...] the correlation of Objective English [...] this particular essay question (Essay [...] in Study I) was higher (.47) in Study I. In the present study the correlation of Verbal with Essay is .39, as compared to a correlation of .36 with Essay [...] in Study I. When the present intercorrelations are corrected for attenuation, using the estimated reliability of .41 for the essay, no difference appears in the order of relationship. The corrected correlation of Essay with Verbal is .62. This would corroborate the indication in Study I that these three variables are measuring the verbal factor to a large extent. However, contrary to Study I, the Essay now shows a higher relationship with Verbal than with Objective English, so that the supposition that the Essay measures a language ability [...] to verbal ability is not [...].

Surprisingly enough, the [...] the two criterion variables [...]. While this approximates the median intercorrelation of .85 found in Study I, it [...] appeared to the writer that the criteria [...] be more independent of each other in the present study where average grades over two years of work are employed and where many of the course grades had been assigned by [...] not participating in the study. The two criteria are, moreover, consistently [...] each other in their relationships to other variables. While it seems on this basis that the painstaking process of obtaining ratings might have been dispensed with, it is [...] rewarding to note that a highly consistent [...] of the students' ability has been attained. Composition is the most heavily emphasized aspect of secondary-school English courses and thus must be regarded as a strong component in course grades. It is to be expected that some criterion unreliability is present, but the correlations of .76 and .77 with another variable (Verbal) indicate that the criteria in this study are much more reliable than is ordinarily thought possible either for school grades or for subjective ratings.

There are wide differences in validity among the predictor variables. The Verbal test is outstanding, with validities of .76 and .77. The Objective English is second best, with validities of .58 and .60. The other validities are: Essay, .40 and .41; Paragraph-Revision, .35 and .37. The "style" scores in [...] higher validities (.39 and .39) than did "content" (.22 and .26), [...] Paragraph B [...] was more predictive than was Paragraph A (.26 and .29). The total English test (.63 and .65) was more predictive than any of its parts, but less predictive than the Verbal.

It is difficult to apply the correction for attenuation to these validities since the reliabilities of the criteria are unknown and since some inaccuracy is already introduced by the necessity for estimating the reliability of the Essay question. However, it is of some interest to make the best estimate possible for the criterion reliabilities and to see if there is any indication that other variables, when corrected for unreliability, might prove more valid than the Verbal test. If, for example, the Essay had a higher validity than the Verbal when corrected for attenuation, then it might be hypothesized that an Essay test measures a component of writing ability other than the verbal factor and that if made sufficiently long would prove a more satisfactory measure of writing ability.

From the data at hand, it seems reasonable


TABLE XVII

INTERCORRELATIONS AMONG VARIABLES EXCLUDING VERBAL TEST: ANALYSIS II WITH WITHIN-CLASS DEVIATION SCORES
(N = 763)

[Table entries not recoverable. Variables: (1) Objective English; (2) Essay, Content; (3) Essay, Style; (4) Paragraph A; (5) Paragraph B; (6) Instructors' Ratings; (7) Average English Grade; (8) Essay Total (2, 3); (9) Paragraph-Revision Total (4, 5); (10) Total English (1, 2, 3, 4, 5); with means and standard deviations.]


TABLE XVIII

INTERCORRELATIONS AMONG VARIABLES EXCLUDING VERBAL TEST: ANALYSIS III WITH ADJUSTED CRITERION SCORES
(N = 763)

[Table entries not recoverable. Variables (1) to (10) as in Table XVII.]


TABLE XIX

MULTIPLE CORRELATIONS FOR ALL PREDICTOR VARIABLES WITH CRITERIA
(N = 420)

                            Multiple correlation (R) and beta-weights with
Predictors                  Instructors' Ratings    Average English Grade

Analysis I                  R = .68                 R = .50
( 1) Objective English       .18                     .15
( 2) Essay-Content          -.02                     .04
( 3) Essay-Style            [...]                    .04
( 4) Paragraph A            -.03                    -.02
( 5) Paragraph B             .05                     .13
(11) Verbal Test             .50                     .30

Analysis II                 R = .75                 R = .75
( 1) Objective English      [...]                    .19
( 2) Essay-Content          [...]                    .02
( 3) Essay-Style            [...]                    .10
( 4) Paragraph A            [...]                    .02
( 5) Paragraph B            [...]                    .08
(11) Verbal Test            [...]                    .55

Analysis III                R = .79                 R = .80
( 1) Objective English      [...]                    .18
( 2) Essay-Content          -.03                     .02
( 3) Essay-Style             .13                     .10
( 4) Paragraph A             .00                     .03
( 5) Paragraph B             .09                     .08
(11) Verbal Test             .60                     .58


TABLE XX

MULTIPLE CORRELATIONS WITH CRITERIA FOR PREDICTOR VARIABLES EXCLUDING VERBAL TEST
(N = 763)

                 Multiple Correlation with
                 Instructors' Ratings    Average English Grade
Analysis I       .53                     .51
Analysis II      .59                     .62
Analysis III     .63                     .64

[Beta-weights for predictors (1) Objective English, (2) Essay-Content, (3) Essay-Style, (4) Paragraph A and (5) Paragraph B are not recoverable, apart from a single value of .44 under Average English Grade in Analysis II.]


to estimate the reliability of the criteria at .82, the correlation of each with the other. Then the correlations between the following pairs of variables, when each member of a pair is infinitely long, are:

Essay vs. Average English Grade                .71
Essay vs. Instructors' Ratings                 .69
Objective English vs. Average English Grade    .75
Objective English vs. Instructors' Ratings     .73

Thus, no evidence appears that either of these English tests is as closely related to writing ability as is the Verbal test which attained uncorrected validities of .76 and .77.
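The corrected coefficients discussed in this part of the study follow from the standard correction for attenuation, which divides an observed correlation by the square root of the product of the two reliabilities. The Python sketch below is illustrative only: the function name `correct_for_attenuation` is hypothetical, and the inputs are the validities and reliability estimates quoted in the text (.41 for the Essay, .78 for Objective English, .96 for Verbal, .82 for each criterion).

```python
from math import sqrt


def correct_for_attenuation(r_xy, r_xx, r_yy):
    """Estimated correlation between two measures if both were
    perfectly reliable (each 'infinitely long')."""
    return r_xy / sqrt(r_xx * r_yy)


pairs = {
    "Essay vs. Average English Grade": (0.41, 0.41, 0.82),
    "Essay vs. Instructors' Ratings": (0.40, 0.41, 0.82),
    "Objective English vs. Average English Grade": (0.60, 0.78, 0.82),
    "Objective English vs. Instructors' Ratings": (0.58, 0.78, 0.82),
    "Essay vs. Verbal": (0.39, 0.41, 0.96),
}
for label, (r_xy, r_xx, r_yy) in pairs.items():
    print(label, round(correct_for_attenuation(r_xy, r_xx, r_yy), 2))
```

Because the Verbal validities of .76 and .77 are uncorrected, the comparison is a conservative one for the Verbal test.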

(5) Components of Writing Ability: Tables XIX and XX are presented to demonstrate the relative extent to which each of the test variables contributes to a measurement of writing [...] multiples obtained when the Verbal test is eliminated.

Examination of the beta-weights in Table XIX indicates that the Verbal test is the chief [...] shown by the lower multiples (.63 and .64) in Table XX than in [...]. [...] beta-weights in Table XIX might appear to indicate that certain [...] .80 with the zero-order validity coefficients of .76 and .77 attained by the Verbal test alone.


(d) Conclusions 


1. The reliabilities of the Objective English section (.78) and of the Verbal test (.96) [...] satisfactory. From the view [...] coefficient of .91 for an unspeeded test.

2. The reader reliability of the Essay question was .62. This is unsatisfactory, particularly in view of the fact that the true reliability must necessarily be lower.

3. The Paragraph-Revision section showed high reader reliabilities, .84 and .81 for Paragraph A and .69 and .72 for Paragraph B.

4. Paragraph-Revision showed low relationships with other variables; it is uncertain what is being measured by this type of material.

5. The verbal factor is the only identifiable factor measured by the Objective English section and the Essay section.

6. Instructors' ratings and course grades are highly correlated (.82), indicating that [...] criteria were attained. Intercorrelations with other variables yield no evidence to support the indication in Study I that the two criteria are measuring different traits.

7. The Verbal test is more closely related to writing ability as defined in this study (correlations of .76 and .77) than is any other variable. The other variables, when combined in a multiple-regression equation with the Verbal test, fail to add appreciably to the relationship to writing ability demonstrated by the Verbal test alone. There is no support for the effectiveness of the English variables in measuring any language ability other than the verbal factor.


F. General Conclusions


The investigation points to the conclusion that in the light of present knowledge, measurable "ability to write" is no more than verbal ability. It has been impossible to demonstrate by the techniques of this study that essay questions, objective questions, or paragraph-revision exercises contain any factor other than verbal; furthermore, these types of questions measure writing ability less well than does a typical verbal test. The high degree of success of the verbal test is, however, a significant outcome.

The results are discouraging to those who would like to develop reliable and valid essay examinations in English composition, a [...] that is now more than half a century old. Improvement in such essay tests has been possible up to a certain point, but professional workers have long since reached what appears to be a stone wall blocking future progress. Basic knowledge of human capacities will have to be [...] Testing Service [...] a comprehensive factor study in which many types of exercises both new and traditional [...] like to endorse such a study as the only [...] means of adding to our knowledge in this field. Even then, it appears unlikely that significant progress can be made without further [...] in the area of personality measurement.



(Appendixes I and II follow)


208 JOURNAL OF EXPERIMENTAL EDUCATION (Vol. XXII 


APPENDIX I
SAMPLE QUESTIONS IN OBJECTIVE ENGLISH


The following items have been released to illustrate the objective English
material used in the investigation. Because of the security regulations
governing the confidential tests of the College Entrance Examination Board,
it is not possible to divulge all of the items. An answer key is provided,
together with the classifications of the items.


Directions: This section tests your ability to write formal English
effectively as well as correctly. In each of the sentences in this section,
certain portions are underlined and numbered. On the right side of the page
are suggested several ways of writing or punctuating each underlined portion,
or certain positions in which to place it in the sentence. Choose the answer
which is correct, or which sounds best, and blacken the space beneath the
corresponding number on the appropriate line on the answer page. The answer
page which you will use for the following questions is the front cover of
your blue answer booklet. NO CREDIT WILL BE ALLOWED FOR ANYTHING WRITTEN IN
THIS TEST BOOKLET OR FOR MULTIPLE ANSWERS.


In some cases "OMIT" is given as a possible answer; choice of this answer
means that you think it would be better to eliminate the underlined portion
entirely than to take any of the other alternatives given.


* * *

I can scarcely be enthusiastic about his skill _as_ (1) a violinist, but I
must admit that he is _somewhat_ (2) better than many musicians who are more
popular.

1. (1) as (2) of being (3) as regards being (4) for
2. (1) somewhat (2) some

* * *

The success of any experiment is jeopardized by _inaccuracy one_ (3) error
may invalidate all the [illegible] _found_ (4).

3. (1) inaccuracy one (2) inaccuracy, one (3) inaccuracy; one
4. (1) found (2) which are found (3) OMIT

* * *

About 1890 Arthur Balfour suggested_, in_ (5) his genially devastating
_way_ (6) that human behavior is not founded on thought, which
_progresses_ (7) but on feeling and _instinct,_ (8) which remain almost
unchanged.

5. (1) , in (2) in
6. (1) way (2) way,
7. (1) progresses (2) progresses,
8. (1) instinct, (2) instinct

* * *

On July 4, 1776, the American colonies declared themselves completely
independent _from_ (9) British rule.

9. (1) from (2) of

* * *

We could not obtain the book _any place_ (10) in town.

10. (1) any place (2) nowhere (3) anywhere (4) anywheres

* * *

No one was much taken by the applicant's personality_, though_ (11) his
credentials indicated that he was pleasant, intelligent, and friendly.

11. (1) , though (2) : though (3) ; though (4) . Though

* * *

The issue is one for _we_ (12) students to decide; it ought not to be left to
the student council alone.

12. (1) we (2) us

* * *

Having decided to install the new machines, we discovered after several
months that we _would_ (13) have to wait at least a year before they
_were_ (14) obtainable.

13. (1) would (2) will
14. (1) were (2) are (3) will be (4) would be

* * *

_The_ (15) atomic theory is not new, _but_ (16) it is more useful in its
present form than [illegible] when first conceived, _and_ (17) the modern
theory was the product of [illegible] work, _but_ (18) only a few modern
scientists deserve the credit for its present usefulness.

15. (1) The (2) Although the (3) In spite of the fact that the
16. (1) but (2) yet (3) OMIT
17. (1) and (2) and although (3) furthermore
18. (1) but (2) yet again (3) and (4) OMIT

* * *

Shakespeare is the most universally loved _of all poets_ (19).

19. (1) of all poets (2) of any poet (3) of any poets (4) of any other poets

* * *

The organization of the United Nations is more complex _than_ (20) the
League of Nations.

20. (1) than (2) than was (3) than that of (4) than those of

* * *

Our staff reported yesterday that they _have not yet deciphered_ (21) the
message.

21. (1) have not yet deciphered (2) had not yet deciphered
    (3) did not yet decipher

* * *

Some day we must undertake such a project_, why_ (22) not now?

22. (1) , why (2) ,—why (3) . Why

* * *

Seton is trying to formulate a general law which would hold _for_ (23) all
civilizations.

23. (1) for (2) in the case of (3) regarding (4) in regard to


Answer Key and Item Classifications


 1 - 1 idiomatic expression
 2 - 1 idiomatic expression
 3 - 3 sentence structure
 4 - 3 sentence structure
 5 - 1 punctuation
 6 - 2 punctuation
 7 - 2 punctuation
 8 - 1 punctuation
 9 - 2 idiomatic expression
10 - 3 idiomatic expression
11 - 1 sentence structure
12 - 2 grammar
13 - 1 grammar
14 - 4 grammar
15 - 2 sentence structure
16 - 3 sentence structure
17 - 2 sentence structure
18 - 4 sentence structure
19 - 1 grammar
20 - 3 grammar
21 - 2 grammar
22 - 3 punctuation
23 - 1 idiomatic expression




APPENDIX II 
SAMPLE VERBAL ITEMS 


The following samples from the C. E. E. B. Bulletin of Information illustrate
the material included in the Verbal test in Study II. An answer key is
provided.


* * * * * *

Directions: Each question in this subtest consists of a group of four words,
two of which are opposite to each other in meaning. Decide which two words in
each group are opposite, and blacken the space beneath the corresponding pair
of numbers on the answer sheet; i.e., mark the space between the dotted lines
beneath "1-2" if words numbered 1 and 2 are opposite, beneath "2-3" if words
2 and 3 are opposite, beneath "3-4" if words 3 and 4 are opposite, etc. Mark
only ONE set of dotted lines for each question, and be sure all your marks
are heavy and black.


1. 1-essential 2-classic 3-superfluous 4-disarming
2. 1-qualified 2-unfit 3-healthful 4-primitive
3. 1-fleeting 2-impenetrable 3-permeable 4-perjured
4. 1-circumscribed 2-tedious 3-senile 4-interesting
5. 1-unwitting 2-serious 3-deliberate 4-mollified
6. 1-authentic 2-mechanical 3-spurious 4-productive
7. 1-dispassionate 2-illustrious 3-impecunious 4-affluent
8. 1-resilient 2-perspicacious 3-salient 4-inconspicuous


* * * * * *


Directions: Each of the questions in this subtest consists of two words which
have a certain relationship to each other, followed by five numbered pairs of
related words. Select the numbered pair of words which are related to each
other in the same way as the original pair of words are related to each
other. Then, on the answer sheet, blacken the space beneath the number
corresponding to the number of the pair you have selected.


9. OINTMENT: BURN:: 1-tears: consolation 2-consolation: grief 
3-butter: bread 4-bread: meat 5-happiness: grief 

10. EROSION: ROCKS:: 1-flatness: landscape 2-fatigue: task 
3-fasting: food 4-dissipation: character 5-forgery: signature 


11. FIBER: FABRIC:: 1-average: aggregate 2-nucleus: cell 
3-obstinacy: deadlock 4-appurtenance: object 5-member: league 


12. REST: FATIGUE:: 1-diploma: graduate 2-laziness: obesity 
3-pinnacle: mountain 4-relaxation: recreation 5-praise: dejection 


13. SKELETON: BODY:: 1-prisoner: cell 2-law: society 3-prisoner: 
law 4-jury: sentence 5-law: jury 


* * * * * *


Directions: In each of the sentences in this subtest there is a blank space,
indicating that a word has been omitted. Beneath the sentence are five
numbered words; from these five words you are to choose the one word which,
when inserted in the blank space, best fits in with the meaning of the
sentence as a whole.


14. One of the most prevalent erroneous contentions is that
Argentina is a country of .......... agricultural resources
and needs only the arrival of ambitious settlers.


1-modernized 2-flourishing 3-undeveloped 4-waning
5-limited


15. The last official statistics for the town indicated the presence
of 24,212 Italians, 6450 Magyars, and 2315 Germans, which
ensures to the .......... a numerical preponderance.


1-Germans 2-figures 3-town 4-Magyars 5-Italians


16. Precision of wording is necessary in good writing; by choosing
words that exactly convey the desired meaning, one can
[illegible] ..........


1-duplicity 2-incongruity 3-complexity 4-ambiguity
5-implications


17. Various civilians of the liberal school in the British Parliament
remonstrated that there were no grounds for .......... of
French aggression, since the Emperor showed less disposition
to augment the navy than had Louis Philippe.


1-suppression 2-retaliation 3-apprehension 4-concealment
5-commencement


* * * * * *


Directions: In this subtest each passage is followed by questions based upon its content. Each question
consists of an incomplete 


After reading a passage, answer each of the questio; ing i i 


and blackening the Space beneath the Corresponding 
The questions following a passage are to be ἀπο, 


passage. 


To regard peace as Somethi 
wantonly chosen not to have is nai 
aspect of Europe's development. 
building up civilization and cultur 
Strong passions, because only st: 


1-prevents any possibilit 




19. The author believes that those who call Europe a failure 
1-are in effect condemning man himself for his frailties 
2-do not realize that frequent war is inevitable 
3-are being realistic about the state of international rela- 
tionships 

4-are ignoring the position of the smaller nations in the 
general scheme 

5-are justified in believing that the European peoples 
should exert a greater effort to establish peace 


20. The author believes that Europe should not be wholly
condemned for its constant wars because

1-civilization advances through war

2-dynamic struggles are inevitable in any effort toward
cultural development

3-world peace can never be attained

4-the individual nations in Europe have been unwilling
to compromise

5-international understanding hinges on more than
physical proximity


21. The author feels that the progress Europe ‘‘has managed to 
accomplish" has been made in 
1-trying to overcome barriers of distance 
2-defeating the forces of human fallibility 
3-laying the foundation for world-wide understanding 
4-developing its civilization 
5-establishing a workable basis for national peace 


* * * * * *


Answer Key 


3 8 FP U 
Αν αν co Co sR CO IO Co 


ODPADA N H 


m 
p 
W»ODO RS OI Q2 η» OI CO DO UI OI ,R. DO WWE NNR 


Se 


AN EXPERIMENTAL EVALUATION OF THE
EFFICACY OF TWO METHODS OF TEACHING
MUSIC APPRECIATION


MORTON J. KESTON 
University of New Mexico 


The Problem 


THE PURPOSE of this experiment was to provide an experimental basis for
judging the relative efficacy of two different points of view in the
teaching of music appreciation. Some music educators maintain that it is
exposure to music alone which may produce the relevant response termed music
appreciation; other music educators insist that in order to be effective,
music must be presented together with carefully prepared comments. The null
hypothesis of this experiment is therefore: There is no significant
difference between experimental and control classes in the development of
music appreciation. The experimental group designates those students who
listened to music together with comments. The control group designates those
students who listened to these same musical compositions without any comment
other than the title of the music to be played.

The Design of the Experiment


Eighty-nine sophomores, juniors, and seniors of the University High School
of the University of Minnesota volunteered to relinquish their study period
in order to join the experiment. In order to avoid bias in the formation of
experimental and control groups, the names of all students of each class
period were gathered into a single group, arranged alphabetically, and then
subdivided alternately into two groups, one of which, randomly determined,
became the experimental group, and the other one the control group. In
addition, a zero control group was set up. A near-by high school choir of 24
students was tested for music preference in September at the beginning of
the school year and again in May toward the end of the school year. This
zero control group was used to provide a basis for estimating what changes
in music preference may be expected from the environmental factors not
controlled in the experimental situation.
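
The assignment procedure just described (alphabetize the pooled names, split them alternately, then let chance decide which half is experimental) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `assign_alternately`, the made-up roster, and the use of a `random.Random` instance for the chance draw are hypothetical and do not come from the original study.

```python
import random


def assign_alternately(names, rng=None):
    """Split an alphabetized roster alternately into two groups.

    Hypothetical helper: a random draw decides which of the two
    alternating halves is labeled "experimental".
    """
    rng = rng or random.Random()
    ordered = sorted(names, key=str.casefold)      # arrange alphabetically
    first, second = ordered[0::2], ordered[1::2]   # subdivide alternately
    if rng.random() < 0.5:                         # chance picks the experimental half
        first, second = second, first
    return {"experimental": first, "control": second}


if __name__ == "__main__":
    roster = ["Olson", "Adams", "Lind", "Berg", "Nelson"]  # made-up names
    print(assign_alternately(roster, random.Random(0)))
```

With an odd number of names the two groups differ in size by one, which is in line with the unequal group sizes reported in the study.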


The over-all procedure was to determine the 
music preferences and other related measures 
of these students at the beginning of the school 
year, subject the several groups to the differ- 
ential treatment for an entire school year, and 
then re-measure the music preferences and 
other variables to note significant differences, 
if any, as a result of the differential treatment.

The activity of the control groups consisted 
only of listening to records, the titles of which 
were announced before the records were played. 
The experimental groups, however, not only 
heard these records, but, in addition, were
subjected to lecture material designed to arouse
interest in the music to be heard. Approximately
half of the class time of the experimental
group was devoted to listening to music, and
the other half was given over to a discussion of 
the music. The organization of recorded ma- 
terial presented to both control and experiment- 
al groups was chronological, beginning with the 
works of Bach and his predecessors and ending 
with the outstanding compositions of the twen- 
tieth century. An extensive testing program 
took place at the beginning of the school year 
and again at the end of the school year. 

The following tests were administered to the 


experimental and control groups: 


1. Oregon Music Discrimination Test. This test 
is administered by means of phonograph record- 
ings. The student is required to judge which 
of a pair of piano selections he prefers. One 

of the selections is taken from the original 
composition; the other member of the pair is a 
distortion of the original composition. There 
are forty-eight such pairs. This test, presum- 
ably a music discrimination test, was given 
both at the beginning and at the end of the year. 


2. Seashore Measures of Musical Talents. This 
test is also administered by means of phono- 
graph recordings. Pitch, Rhythm, and Tonal 
Memory, three of the six Seashore measures 
of this battery, were used. Loudness, Time, 
and Timbre, the other three measures of the
battery, were eliminated from the study because
these qualities seemed remote from the
factor under investigation, the development of
musical taste.


3. Kwalwasser-Ruch Test of Musical Accomplishment.
This is a paper-and-pencil test
and measures musical knowledge of such factors
as musical symbols, time signatures, note values
and the like.


4. Kwalwasser Test of Musical Information and 
Appreciation. This is also a pencil 


5. Otis Quick Scoring Mental Ability Test. This
test was administered in order to obtain a measure
of the intelligence of each student in the
study.


6. Keston Music Preference Test. Because it 
Was suspected that the Oregon Musi i 


ms of four Selections 

-faced acetate 
record discs, ection lasted 
forty-five Seconds, so that each 


(4) Category D, “swing”, 
4X3x20r 24, possible arra. 
four music Selections, the fir. 


1 the experts in his 
choices. These expectations were borne out in 
the administration of the t 


as illustrated in Table I.
The scoring of the test proved to be a matter of considerable difficulty,
for weightings had to be determined in such a way that a subject would
receive some credit for judgments which were approximately "correct". A
method of difference was finally adopted which measured the degree of
departure of the subject from the judgments of the music authorities. Thus,
the higher the score of a given individual, the greater his departure from
the opinion of the experts, and, consequently, the lower his standing on the
Music Preference Test. Scores range from a theoretical perfect score of 0 to
a theoretical maximum score of 159.6. It will be noted that even the expert
group had an average score of 25. This is because these experts reflect
minor differences among themselves in musical taste, and they deviate, on
the average, 25 points from a system of classification used in the
construction of the test. Several of them, however, came within 15 points
of the theoretical perfect score.
An examination of Table I demonstrates the validity of the scale.
Operationally, therefore, with the use of this test, it was possible to de-


dents, their musical judgments were superio 

students disagreed πο μας. 

experts, their musical judgments were κ roups 
he Consistently low scores of the music g 


The reliability of the test was determined by an analysis of test-retest
data according to the analysis of variance technique of Jackson and
Ferguson (3). The results of the analysis indicated the test to be a
sensitive one with an over-all reliability of .95.
In addition, an investigation of the consistency of the subjects' ranking
was made by examining the incidence of circular triads.


, 


for ex” 
then indicated a preference 

would be present. 
analyzed, 151 r 


or 11.5 percent were found to
contain a circular triad. The Music Preference
Test used in this study was therefore
1. The method of grading the Music Preference Test and the determination of
the validity and reliability of the scale will appear in full in a
forthcoming Psychological Monograph, "Musical Preference."


March, 1954) KESTON 


TABLE I


MEANS AND RANGES OF MUSIC PREFERENCE TEST SCORES OF EXPERT GROUPS
AND OF JUNIOR AND SENIOR HIGH SCHOOL STUDENTS


Group                                                  Number   Mean   Range
Music Authorities                                        12      25    11- 39
Music Students
  (Sophomores, University of Minnesota)                  68      44     8-101
Music Students
  (Seniors, University of Minnesota)                     11      26     7- 55
Music Students
  (Graduates, University of Minnesota)                   17      26    10- 53
Senior High School Students
  (University High School, University of Minnesota)      89     104    40-158
Junior High School Students
  (University High School, University of Minnesota)             120    59-156


TABLE II


MEANS AND RANGES OF MUSIC RECOGNITION TEST
SCORES OF SEVERAL GROUPS


Group                              Number   Mean    Range
Students of Study
[illegible] School                   58     2.29    0- 9
[illegible] School                   24     6.04    0-11
Marshall High School
  (Junior High)                      36     1.22    0- 5




TABLE III


BASIC DATA NECESSARY FOR THE ANALYSIS OF VARIANCE AND COVARIANCE
OF FINAL MUSIC PREFERENCE TEST SCORES WITH
INITIAL MUSIC PREFERENCE TEST SCORES HELD CONSTANT


Raw Scores

Group            N      ΣX      ΣY        ΣX²        ΣY²        ΣXY
Experimental    48    4,564   3,790    473,972    342,714    395,272
Control         41    4,734   4,528    580,912    547,104    551,118
Sum             89    9,298   8,318  1,054,884    889,818    946,390


Deviation Scores

Group               Σx²          Σy²          Σxy
Experimental     40,011.67    43,461.92    34,906.17
Control          34,308.20    47,036.10    28,299.66
Sum              74,319.87    90,498.02    63,205.83
Total            83,504.18   112,412.20    77,392.65


TABLE IV


ANALYSIS OF VARIANCE AND COVARIANCE OF MUSIC PREFERENCE TEST SCORES
HOLDING INITIAL MUSIC PREFERENCE TEST SCORES CONSTANT


                                                      Adjusted or Reduced
Source of                                             Sum of      Mean
Variation   df     Σx²         Σxy         Σy²    df  Squares     Square     F     Hypothesis
Within      87   74,319.87   63,205.83   90,498.02  86  36,750.65    427.33
Between      1    9,184.31   14,186.82   21,914.18   1   3,932.70  3,932.70  9.20   Rejected
Total       88   83,504.18   77,392.65  112,412.20  87  40,683.35




found to be internally consistent as well as reliable, for approximately
only one item out of every ten items manifested a circular triad. A
chi-square test revealed no significant difference between the experimental
groups and control groups in the proportion of circular triads found.²

The Music Preference Test was administered to the experimental groups,
control groups, and zero control group at the beginning of the school year
and at the end of the school year in order to note any possible shift in
musical preference as a result of the differential treatment. A significant
shift was found, and the paper attempts an analysis of this shift and the
part played by diverse factors in the individuals who took part in the
experiment.

7. Keston Music Recognition Test. The ability to identify musical
compositions is generally granted little [illegible] of musical taste, for
the elements of musical taste are musical judgments, not musical facts.
However, it was considered of interest in this study to collect some data on
the ability to recognize music in order to note its relationship or lack of
relationship to the development of musical discrimination.

Again, however, no test was available, and one had to be constructed. The
best known compositions of each of 30 composers were selected and
re-recorded on acetate discs. In the administration of this test, the
subject listens to excerpts which last forty-five seconds and indicates by
number which of 34 listed names corresponds to the composer of each excerpt.
One point is allowed for each correct response; [illegible]. The objectivity
of the test assured its validity. The reliability of the test was determined
according to the technique used for the Music Preference Test. The Music
Recognition Test was found to be a sensitive one with an over-all
reliability of .93. Table II includes the means and ranges of several of the
groups tested.

In addition, [illegible] of the students [illegible] records gathered from
the [illegible].

The Statistical Analysis of the Music Preference Test Scores

1. Test for pooling: The data of this experiment were collected from three
experimental groups and three control groups. Analysis of variance applied
to the scores of the final Music Preference Test demonstrated that it was
statistically permissible to pool these groups into a single experimental
group and a single control group, i.e., the groups to be pooled were found
to be homogeneous with respect to final Music Preference Test scores.

2. Test for normality: A basic assumption of the analyses was normality of
the data. The data of the pooled experimental group and the pooled control
group were subjected to the probit test of normality. Each Music Preference
Test score was assigned a percentage in the distribution by the use of the
formula $(2n - 1)\,100 / 2N$, where n is the rank of a given Music
Preference Test score in the group, and N is the total number of scores.
These percentages were converted to probits by referring to the appropriate
table (Table IX in Fisher and Yates, ref. 1). When the final Music
Preference Test scores were plotted against the probit values for the
scores, the resulting points approximated a straight line in both the
experimental group and the control group. This is the criterion of the
probit test of normality, the formation of a straight line when the scores
are plotted as ordinates and the probits as abscissae.

3. The analysis of variance and covariance: The analysis of variance and
covariance is particularly applicable in the analysis of these data, for the
fundamental question to be answered is: Have there been any significant
changes in the final scores of the two groups when the original scores have
been taken into account? An analysis of the initial Music Preference Test
scores indicated that the experimental group and the control group were
significantly different with respect to music preference ratings, i.e., the
groups were unmatched. However, analysis of variance and covariance renders
obsolete the necessity for rigidly matched groups, because by means of this
statistical technique, the effect of the inequalities between original music
preference ratings of the experimental and control groups may be eliminated
as a variable in the experiment. This is accomplished by adjusting the sum
of squares of the final Music Preference Test scores, so that the effect of
the inequalities in the initial Music Preference Test scores of the two
groups on the final scores is removed or eliminated statistically.
Similarly, the effects of inequalities of other factors such as
intelligence, socio-economic status, and the like on the final Music
Preference Test scores may be eliminated by appropriate adjustments. These
adjustments on the final sums of squares of the variable under consideration
come about through the analysis of the sum of the cross products of the
variable under consideration and the other variable whose effect is to be
eliminated.

2. An article devoted exclusively to the problem of the circular triads
found in this study will be published in the near future.
In the first analysis of variance and covariance ... referred to as X ...

A basic assumption in the analysis of covariance is that the
regression coefficients within ... groups are involved, an appropriate
test is the ... is a t test, a test of the hypothesis that two within
regression coefficients, b1 and b2, obtained from two random samples
of sizes N1 and N2 ...

The value of t ... -63, calculated from the data ... confirmed by the
F-test.

Table IV represents the ... on the final Music Preference Test scores.
The F value obtained in ... group.
It is relevant at this point to run a similar analysis of variance ...
scores on the initial Oregon Music Discrimination Test is eliminated.

The F value obtained is not significant at the 5 percent level, and
the results of this analysis indicate no significant difference
between the adjusted means of the experimental group and the control
group. Inasmuch as a significant difference was found in the analysis
of the Music Preference Test scores, evidently the Oregon Music
Discrimination Test measures a different capacity than the Music
Preference Test. This comparison of the two music discrimination tests
indicates that the decision not to use the Oregon Music Discrimination
Test as an important testing device in this experiment was a correct
one.
It is possible to carry out the analysis of variance and covariance in
such a way that the sum of squares of the dependent variable about the
mean may be adjusted or freed from the effects of several factors
simultaneously. Theoretically, any number of independent variables may
be included in such an analysis, but the complexity of the process
increases with the addition of each independent variable. A practical
limit for the number of independent variables to be included in such
an analysis would be two. In this experiment data were collected on
ten variables other than musical preference. An appropriate question
at this point would be to ask what influence each of these factors had
on the final Music Preference Test scores. In order to do this, the
final Music Preference Test scores would have to be freed from the
influence of the initial Music Preference Test scores and, at the same
time, from each of the ten variables.

It would be pertinent, for example, to inquire whether or not the
factor of music recognition was influential in determining the scores
on the final Music Preference Test. The null hypothesis to be tested
was: there is no significant difference in the means of the final
Music Preference Test scores when these mean scores have been adjusted
for any inequalities between the groups with respect to both factors,
the initial Music Preference Test scores and the initial Music
Recognition Test scores. The set of equations used in this analysis
is:


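$$\Sigma y^2 = A;\quad \Sigma x^2 = B;\quad \Sigma z^2 = C;\quad \Sigma yx = D;\quad \Sigma yz = E;\quad \Sigma xz = F,$$

and

$$M = \frac{CD^2 - FED}{BC - F^2}, \qquad N = \frac{BE^2 - FDE}{BC - F^2}.$$

The adjusted $\Sigma y^2$, in which the dependent variable is freed
from the influence of both factors, is then equal to $A - M - N$.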
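
The adjustment can be checked numerically. The following is a minimal sketch in Python, not the author's computing routine: the function names and the six sets of scores are hypothetical, and the sums are formed about a single mean (for a within-groups analysis the same sums would be formed about each group's own mean). It assumes the two independent variables are not perfectly correlated, so that BC - F^2 is not zero.

```python
def sums_of_products(y, x, z):
    """Return A, B, C, D, E, F: sums of squares and cross products of deviations."""
    n = len(y)
    my, mx, mz = sum(y) / n, sum(x) / n, sum(z) / n
    dy = [v - my for v in y]
    dx = [v - mx for v in x]
    dz = [v - mz for v in z]
    A = sum(v * v for v in dy)                  # sum of y squared
    B = sum(v * v for v in dx)                  # sum of x squared
    C = sum(v * v for v in dz)                  # sum of z squared
    D = sum(a * b for a, b in zip(dy, dx))      # sum of yx
    E = sum(a * b for a, b in zip(dy, dz))      # sum of yz
    F = sum(a * b for a, b in zip(dx, dz))      # sum of xz
    return A, B, C, D, E, F


def adjusted_sum_of_squares(y, x, z):
    """Sum of squares of y freed from both x and z: A - M - N."""
    A, B, C, D, E, F = sums_of_products(y, x, z)
    denom = B * C - F * F
    M = (C * D * D - F * E * D) / denom
    N = (B * E * E - F * D * E) / denom
    return A - M - N


# Hypothetical scores for six pupils (illustration only, not the study's data).
final_pref = [30, 42, 35, 50, 47, 38]       # dependent variable y
initial_pref = [28, 40, 30, 45, 44, 36]     # first factor x
initial_recog = [10, 14, 15, 20, 16, 11]    # second factor z

adjusted = adjusted_sum_of_squares(final_pref, initial_pref, initial_recog)
```

Under these assumptions `adjusted` equals the sum of squared residuals left after regressing y on both x and z, which is the sense in which the dependent variable is "freed" from the two factors.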


Table VI is the analysis of variance and covariance table for the
final Music Preference Test scores with both initial Music Preference
Test scores and initial Recognition Test scores held


(Vol. XXI 


221 


KESTON 


March, 1954) 


TABLE V

ANALYSIS OF VARIANCE AND COVARIANCE OF FINAL OREGON MUSIC DISCRIMINATION TEST
SCORES HOLDING INITIAL OREGON MUSIC DISCRIMINATION TEST SCORES CONSTANT

Source of Variance   df   Σy²   Σx²   Σxy   Adjusted or Reduced: df   Sum of Squares   Mean Square   F   Hypothesis


TABLE VI

ANALYSIS OF VARIANCE AND COVARIANCE OF FINAL MUSIC PREFERENCE TEST SCORES
WITH BOTH INITIAL MUSIC PREFERENCE TEST SCORES AND INITIAL MUSIC
RECOGNITION TEST SCORES HELD CONSTANT

Source of Variance   df   Adjusted or Reduced: df   Sum of Squares   Mean Square   F   Hypothesis


TABLE VII

SUMMARY OF F RATIOS AND CONCLUSIONS OF TEN ANALYSES OF VARIANCE AND COVARIANCE
OF FINAL MUSIC PREFERENCE TEST SCORES WITH TWO INDEPENDENT VARIABLES

Dependent Variable   Independent Variable   F Ratio   Conclusion


constant.

The F value obtained was beyond the tabled value at the one percent
level of significance, and there was a significant difference between
the means of the two groups even when the effects or influence of the
initial Music Preference Test and the initial Music Recognition Test
were removed.

In this study, data were collected on fourteen variables. These are:

1. Final Music Preference Test
2. Initial Music Preference Test
3. Initial Music Recognition Test
4. Final Music Recognition Test
5. Musical Accomplishment Test
6. Musical Information Test
7. Initial Oregon Music Discrimination Test
8. Final Oregon Music Discrimination Test
9. Pitch
10. Tonal Memory
11. Rhythm
12. Intelligence Quotient
13. Grade-Point Average
14. Socio-Economic Status


Ten analyses of variance and covariance with two independent variables
were carried out. In each case, the initial Music Preference Test and
one other variable were held constant in order to free the final Music
Preference Test scores of the effects of the inequalities in these
various variables, taken two at a time. Table VII is a summary table
including all F ratios found and the conclusions drawn from the ten
analyses of variance and covariance. In all ten analyses, the F ratios
were beyond the tabled values for F at the one percent level of
significance. This substantiates the conclusion that the method of
instruction in music appreciation in which students listen both to
music and to commentary and discussion of music is superior to the
method of instruction in which they listen to music alone. Further,
this conclusion is now based upon a comparison of the means of the
final Music Preference Test scores after these means have been
adjusted for any differences between the two groups in each of the ten
factors combined with the initial Music Preference Test scores in the
analysis of variance and covariance.


The Zero Control Group

The Music Preference Test was administered to a high school choir of
24 students at the beginning of the school year and again toward the
end of the school year in order to provide a basis for estimating

3. The tabled value for t with N - 1 or 23 degree



what changes in musical preference may be expected from the
environmental factors not controlled in the experimental situation.
The appropriate statistical test in this situation is a modified t
test, because a high correlation is present between the initial and
the final scores. The obtained t value of 3.48 was beyond the tabled
value for t at the one percent level of significance, and the null
hypothesis that there is no significant difference between the scores
at the beginning of the year and at the end of the year must be
rejected.³
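
The text does not give the formula for its modified t test. One common form for two testings of the same pupils is the t test on the individual differences, sketched below in Python as an assumption rather than as the author's exact procedure. The function name and the four pairs of scores are hypothetical; only the values 3.48 and 2.807 (23 degrees of freedom, 1 percent level) come from the text.

```python
import math


def paired_t(initial, final):
    """t statistic and degrees of freedom for the mean change in paired scores."""
    if len(initial) != len(final) or len(initial) < 2:
        raise ValueError("need two equally long lists with at least two pupils")
    diffs = [f - i for i, f in zip(initial, final)]
    n = len(diffs)
    mean_diff = sum(diffs) / n
    # Unbiased variance of the differences (divisor n - 1).
    var_diff = sum((d - mean_diff) ** 2 for d in diffs) / (n - 1)
    t = mean_diff / math.sqrt(var_diff / n)
    return t, n - 1


# Hypothetical scores for four pupils (illustration only, not the study's data).
initial_scores = [10, 12, 9, 14]
final_scores = [12, 15, 9, 17]
t_value, df = paired_t(initial_scores, final_scores)

# Values quoted in the text for the zero control group.
reported_t, tabled_t = 3.48, 2.807
rejects_null = abs(reported_t) > tabled_t
```

Because each pupil serves as his own control, the correlation between initial and final scores is absorbed into the variance of the differences, which is why an independent-samples t test would be inappropriate here.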

The mean of this group on the Music Preference Test at the beginning
of the year was 138.71; at the end of the year it had dropped roughly
six points to 132.50. The latter mean remains high and indicates a
decided preference for popular music in the group as a whole. The mean
of the experimental group of this study shifted approximately sixteen
points from a mean of 19.36 during the year. However, the t test
indicates a significant change in the zero control group, and this
improvement in the musical preference of the group may have been the
result of the activities of the students as members of a high school
choir. For this reason, the design of this experiment may perhaps have
been improved by including an additional zero control group which did
not participate in any formal musical activity.


The Statistical Analysis of the Music Recognition Test Scores


1. Test for pooling: As in the case of the Music Preference Test
scores, it was found permissible to group these scores into a single
experimental group and a single control group.

2. Test for normality: The two groups were subjected to a probit test
of normality and found to be normal.

3. The analysis of variance and covariance: Table VIII tests the null
hypothesis: there is no significant difference between the final Music
Recognition Test scores of the experimental group and the control
group when the initial test scores are held constant. The F ratio of
6.09 lies within the region of doubt, for it is larger than the tabled
value for the 5 percent level, but smaller than the value for the one
percent level. The null hypothesis may be rejected at the 5 percent
level of significance but not at the 1 percent level of significance.
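
The decision rule used here and in Table IX can be written out as a short sketch. The function name is hypothetical; the mean squares (124.11 between, 20.38 within) and the tabled values (3.96 and 6.96) are those printed in Table VIII.

```python
def classify_f(f_ratio, tabled_5, tabled_1):
    """Label an F ratio by the tabled 5 percent and 1 percent values."""
    if f_ratio > tabled_1:
        return "Significant"
    if f_ratio > tabled_5:
        return "Region of doubt"
    return "Not significant"


# F is the adjusted mean square between groups over that within groups.
f_ratio = 124.11 / 20.38
conclusion = classify_f(f_ratio, tabled_5=3.96, tabled_1=6.96)
```

With these figures the ratio falls between the two tabled values, which is the "region of doubt" described in the paragraph above.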

Analyses of variance and covariance on the 
Music Recognition Test scores with two inde- 
pendent variables were carried out just as was 
done in the preceding section with the Music 
Preference Test scores. Table IX is the sum- 


s of freedom is 2.807 at the 1 percent level.




TABLE VIII

ANALYSIS OF VARIANCE AND COVARIANCE OF MUSIC RECOGNITION TEST SCORES
HOLDING INITIAL MUSIC RECOGNITION TEST SCORES CONSTANT

Source of                                            Sum of     Mean
Variance   df   Σy²       Σx²       Σxy       df    Squares    Square   F*     Hypothesis

Within     87   4444.92   2437.45   2590.71   86    1691.32     20.38   6.09   Region of doubt
Between     1    369.88     56.15    144.11    1     124.11    124.11

Total      88   4814.80   2493.60   2734.82   87    1815.43

*3.96 at 5% level; 6.96 at 1% level.
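
The adjusted sums of squares in Table VIII can be reproduced from the three sums printed on each row. The sketch below assumes that the first column of sums is the dependent variable (final scores), the second the independent variable (initial scores), and the third their cross products; that column assignment is an inference from the arithmetic, not something the table states, and the function name is hypothetical.

```python
def adjusted_ss(sum_y2, sum_x2, sum_xy):
    """Sum of squares of y after removing its linear regression on x."""
    return sum_y2 - sum_xy ** 2 / sum_x2


# Sums taken from the Within and Total rows of Table VIII.
within = adjusted_ss(4444.92, 2437.45, 2590.71)
total = adjusted_ss(4814.80, 2493.60, 2734.82)

# The adjusted between-groups sum of squares is found by subtraction.
between = total - within
```

Under this assumption the results agree with the tabled adjusted sums of squares for the Within and Between rows to within rounding.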


TABLE IX

SUMMARY OF F RATIOS AND CONCLUSIONS OF TEN ANALYSES OF VARIANCE AND COVARIANCE
OF FINAL MUSIC RECOGNITION TEST SCORES WITH TWO INDEPENDENT VARIABLES

Dependent Variable   Independent Variable                        F Ratio   Conclusion

 1. Final MRT        Initial Music Recognition Test
                     Initial Music Preference Test                 4.74    Region of doubt
 2. Final MRT        Initial Music Recognition Test
                     Musical Accomplishment Test                   2.81    Not significant
 3. Final MRT        Initial Music Recognition Test
                     Musical Information Test                      5.32    Region of doubt
 4. Final MRT        Initial Music Recognition Test
                     Initial Oregon Music Discrimination Test      8.41    Significant
 5. Final MRT        Initial Music Recognition Test
                     Pitch                                         4.62    Region of doubt
 6. Final MRT        Initial Music Recognition Test
                     Tonal Memory                                  8.74    Significant
 7. Final MRT        Initial Music Recognition Test
                     Rhythm                                        1.24    Not significant
 8. Final MRT        Initial Music Recognition Test
                     Intelligence Quotient                         6.53    Region of doubt
 9. Final MRT        Initial Music Recognition Test
                     Grade-Point Average                           7.27    Significant
10. Final MRT        Initial Music Recognition Test
                     Socio-Economic Status                         5.92    Region of doubt


TABLE X

CORRELATION COEFFICIENTS FOUND IN THE STUDY



mary table of all F ratios found and the resulting conclusions.

The F ratios of Table IX do not permit the sweeping generalization
which was possible from the F ratios of Table VII. In the analyses
using the Music Recognition Test scores, five of the ten F ratios fall
within the region of doubt and two others are clearly not significant.
This finding, however, does not detract from the important conclusion
of the study. The basis of music appreciation is a relevant response
to music, and this response involves value judgments. The knowledge of
facts about music such as the name of the composition or the composer
is secondary and relatively unimportant. Gernet remarks in his study
that the ability to recognize music is not important in the
appreciation of music. (2)

The analyses of the Music Recognition Test ...

... between pairs of variables ... study were calculated. ... of the
correlation coefficients found in this study.


Summary and Conclusions


This study was conducted ... an experimental basis ... group,
consisted of ... music without comment ... rhythm, I.Q., grade-point
average, and socio-economic status. The statistical tool utilized in
the analysis of the data was analysis of variance and covariance. This
analysis performed on the final Music Preference Test scores with the
initial Music Preference Test scores held constant revealed that there
was a significant difference between the means of the experimental and
control groups when the final scores were adjusted or freed from the
effects of the initial Music Preference Test scores. Ten additional
analyses of variance and covariance were performed on the final Music
Preference Test scores with both the initial Music Preference Test
scores and each of the ten independent variables held constant. In
every case, a significant difference was found between the means of
the experimental and control groups on the final Music Preference Test
scores after the necessary adjustments were made.

The educational implication of the results of these analyses indicates
the superiority of the method used in teaching the experimental group.
The final conclusion of the study, therefore, is that the method of
instruction in music appreciation which utilizes commentary and
discussion aimed to develop appreciation in conjunction with listening
to music is superior to the method of instruction in which music is
listened to without comment.


REFERENCES


1. Fisher, Ronald A., and Yates, Frank. Statistical Tables for
   Biological, Agricultural and Medical Research (... Publishing Co.,
   1948), 112 pp.

2. Gernet, Sterling K. Musical Discrimination at Various Age and Grade
   Levels (College Place, Washington: The College Press, 1940), 160 pp.

3. Jackson, Ro... Studies on ...


FUNCTIONAL COMPETENCE IN
MATHEMATICS


G. DON ALKIRE
Fresno State College
Fresno, California


IN RECENT years the mathematics program ... subjected to frequent
criticism. Meanwhile, in our technological society, the need for
mathematical proficiency has been expanding. In an attempt to resolve
some of the issues raised, a first step might well be the definition
and analysis of the existing situation. Need is evident for carefully
conducted and comprehensive research in this area. The present study
was undertaken with the intention of contributing toward the
satisfaction of that need.


Statement of the Problem


The problem was to determine some characteristics of the mathematics
program in South Dakota's secondary schools during the academic year
of 1951-1952, and to investigate the relation of functional competence
in mathematics to certain factors resident in the pupil, in the
school, and in the teacher. The study dealt with the composite of the
mathematics courses taken in the secondary school and was concerned
with functional competence in mathematics of students in the fourth
year of high school.

Delimitations of the Problem


The primary purpose of the study was to secure statistically verified
evidence concerning the following questions:

1. Which of certain factors resident in the pupil are significantly
related to his functional competence in mathematics? Specifically,
when means of functional-competence test scores are adjusted for
inequalities in mental scores (deviation I.Q.'s) among the groups
under consideration, does the adjusted functional-competence mean:

a. of either sex exceed that of the other sex?

b. of pupils who received their arithmetic training in rural schools
differ significantly from that of pupils who received their arithmetic
training in urban schools?

c. of pupils who plan to attend college differ significantly from that
of pupils who have not formed such plans?

d. of pupils who have had one or two years of mathematics in high
school differ significantly from that of pupils who have had more than
two years of mathematics in high school?

e. of pupils whose grade-point average in mathematics courses places
them in the lower one-fourth of the distribution of grade-point
averages in mathematics differ significantly from that of pupils whose
grade-point average in mathematics places them in the upper
one-fourth?

f. of pupils whose rank places them in the lower one-fourth of the
graduating class differ significantly from that of pupils whose rank
places them in the upper one-fourth?

2. Which of certain factors resident in the school are significantly
related to the functional competence in mathematics of its pupils? In
particular, when means of functional-competence test scores are
adjusted for inequalities in mental scores among the groups under
consideration, are there any significant differences in adjusted
functional-competence means among:

a. schools enrolling less than 100 pupils, those enrolling from 100 to
500 pupils, and those enrolling 500 or more?

b. schools belonging to school districts the assessed valuation of
which places them in the upper one-fourth of the distribution of
schools classified on the basis of assessed valuation of school
districts, and schools belonging to school districts the assessed
valuation of which places them in the lower one-fourth?

3. When the mean functional-competence scores are adjusted for
inequalities in mental scores among the groups under comparison, are
there any significant differences among the adjusted
functional-competence means of pupils attending schools, the average
T-score* of whose teachers places the schools in the upper one-fourth
of the distribution of schools ranked according to T-scores, and the
adjusted functional-competence means of pupils attending schools, the
average T-score of whose teachers places the schools in the lower
one-fourth of the distribution?

Summary of The Relation of Certain Factors Resident in the Pupil, in
the School, and in the Teacher, to Functional Competence in
Mathematics, unpublished doctoral dissertation, University of Kansas.
Advisor: Kenneth E. Anderson, Dean, School of Education, University of
Kansas.


Selection of the Sample


... approximately ten percent ... secondary schools in South Dakota.
... year schools and one 6-3-3 ...


Data-Gathering Instruments


Data were obtained from the thirty schools in the sample as follows:

1. ... participating in the ... of a schedule filled out by each pupil
... together with other information, indicated ...

2. Data concerning the length of employment, the experience in
teaching mathematics, and the academic qualifications of the teachers
were obtained from a schedule which was filled out by the principal or
superintendent of the school.

3. Data concerning the mathematics courses offered, the units of
credit for each course, and the rank of each pupil in the graduating
class were obtained from a schedule which was filled out by the
superintendent or principal of the school.

4. Mathematics test scores in terms of standard scores were obtained
from the Davis Test of Functional Competence in Mathematics.

5. Mental test scores in terms of deviation I.Q.'s were obtained from
the Terman-McNemar Test of Mental Ability.


The Examinations


The mental test and mathematics test were well known standardized
tests, properly validated and shown to possess a high degree of
reliability. One can have considerable confidence that the
examinations used were measuring the objectives for which they were
designed, and were doing it consistently.

The Davis test was constructed to measure functional competence in
grades 9 through 12 and consists of 80 items based on the essentials
for functional competence in mathematics as outlined by the Commission
on Post-War Plans of the National Council of Teachers of Mathematics.
The Commission defined functional competence in mathematics by a check
list of 29 items and stated that the school might turn out students
that are functionally competent in mathematics if the program were
built substantially on abilities and outcomes in certain areas. Some
eminent mathematics educators have regarded the Commission's report as
the most authoritative statement of the objectives of mathematics
instruction at the secondary level.


Distribution of Scores


The distribution of deviation I.Q.'s was slightly skewed and slightly
leptokurtic. The Chi-square test showed that the distribution did not
depart significantly from normality. Histograms ... superimposed
normal curves of best fit, ...

* Each teacher was given a T-score in terms of years of teaching
experience and a T-score in terms of number of semester hours of
mathematics earned in higher institutions. The average T-score used in
the study is the mean of those two T-scores.
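
The footnote does not say how the T-scores were computed. A minimal sketch, assuming the conventional linear T-score (mean 50, standard deviation 10) and using the population standard deviation, is given below; the function name and the figures for four teachers are hypothetical and are not data from the study.

```python
import statistics


def t_scores(values):
    """Convert raw values to linear T-scores: 50 + 10 * z."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)  # population SD; the study's choice is unknown
    if sd == 0:
        raise ValueError("all values are equal; T-scores are undefined")
    return [50 + 10 * (v - mean) / sd for v in values]


# Hypothetical figures for four teachers (illustration only).
semester_hours = [12, 20, 30, 18]
years_experience = [2, 10, 6, 22]

hours_t = t_scores(semester_hours)
years_t = t_scores(years_experience)

# Each teacher's average T-score is the mean of the two T-scores.
teacher_t = [(h + y) / 2 for h, y in zip(hours_t, years_t)]
```

Putting both measures on the same scale before averaging keeps years of experience and semester hours from being weighted by their differing units.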


TABLE I

DISTRIBUTION OF FIGURES PERTAINING TO ENROLLMENT AND
ADMINISTRATIVE ORGANIZATION OF THE 298 SECONDARY
SCHOOLS IN SOUTH DAKOTA DURING 1951-52

Classification of School         Number      Percentage
Enrollment        Organization   in State    of Total

Less than         6-3-3              2           .67
100 pupils        4-year           203         68.12
                  6-6                8          2.67
                                   213         71.46

From 100 to       6-3-3              3          1.01
500 pupils        4-year            73         24.50
                  6-6                3          1.01
                                    79         26.52

500 or more       6-3-3              4          1.34
pupils            4-year             1           .34
                  6-6                1           .34

Grand Total                        298        100.00



TABLE II

TEST OF HOMOGENEITY OF VARIANCES FOR POOLING, COLLEGE BOUND, COMPARISON III-A

Group   ns   log ns   θs   log θs   L1   Hypothesis



TABLE III

ANALYSIS OF VARIANCE OF THE DAVIS SCORES TO DETERMINE THE HOMOGENEITY OF
MEANS FOR POOLING, FOR COMPARISON III-A, COLLEGE BOUND

Source of Variation   df   Sum of Squares   Mean Square   F   Probability   Hypothesis


TABLE IV

TEST FOR HOMOGENEITY OF VARIANCES, COMPARISON III-A

Groups              ns    log ns    ns log ns    θs           log θs    ns log θs     L1     Hypothesis

College Bound       227   2.35603                42636.2555   4.62978                        Null
Non-College Bound   243   2.38561                42561.8354   4.62902                        P > .05
Total               470             1114.52204   85198.0909   4.93043   2175.81192    .999   Accept



each distribution was uni-modal and exhibited only a very slight
departure from normality. Fisher (3) has shown that for curves that
exhibit only a moderate departure from normality, the efficiency of
certain statistical techniques remains reasonably high. Also, in using
the analysis of variance and covariance, the assumptions basic to this
technique were tested. And for the t-test and F-test used in the
analysis, Cochran (2) has shown that no serious error for a slight
departure from normality is introduced in the significance levels.


The Comparisons and Statistical Tools Employed


In this study there were nine major comparisons, each consisting, in
most cases, of two or three minor comparisons. Using the technique ...
employed, the writer turned to the Behrens-Fisher d-test. ...


Analysis of Variance and Covariance


In the first group (college-bound) there were 417 pupils representing
twenty-... the thirty schools.

Before the ... and mathematics test scores ... two groups could be
pooled, two assumptions basic to pooling had to be fulfilled:

1. that there was no difference between the groups to be pooled in
regard to standard deviations;

2. that there was no difference between the groups to be pooled in
regard to means.


The first assumption was tested by use of the Welch-Nayer test on the
"sum of squares within groups." The value for L1 was obtained from the
formula

$$\log L_1 = \log N - \frac{1}{N}\sum n_s \log n_s + \frac{1}{N}\sum n_s \log \theta_s - \log\left(\sum \theta_s\right)$$

and the corresponding probability was found in Nayer's tables. The
second assumption was tested by use of the F-test, in which F was
obtained by dividing the mean square between the groups by the mean
square within the groups. Entering Snedecor's table, the probability
corresponding to a given value of F was obtained. The value of F as
found by the analysis of variance technique assumes equality of
variances of the groups involved. This equality was tested by the
previously mentioned Welch-Nayer test.
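
The L1 formula can be evaluated directly. The sketch below is offered as a check on the arithmetic rather than as the author's procedure: the function name is hypothetical, logarithms are taken to base 10 as in the tables, n_s is read as the number of pupils in group s, θ_s as that group's sum of squares, and N as the sum of the n_s. The two group sizes and sums of squares are those given for Comparison III-A in Table V.

```python
import math


def l1_statistic(sizes, sums_of_squares):
    """L1 criterion for homogeneity: 1.0 when all group variances are equal."""
    big_n = sum(sizes)
    log_l1 = (
        math.log10(big_n)
        - sum(n * math.log10(n) for n in sizes) / big_n
        + sum(n * math.log10(t) for n, t in zip(sizes, sums_of_squares)) / big_n
        - math.log10(sum(sums_of_squares))
    )
    return 10 ** log_l1


# Table V, Comparison III-A: 227 college-bound and 243 non-college-bound pupils.
l1_regression = l1_statistic([227, 243], [30958.9907, 29230.9094])

# Two groups with identical size and sum of squares give exactly L1 = 1.
l1_equal = l1_statistic([10, 10], [50.0, 50.0])
```

Values of L1 near 1 indicate nearly equal variances; the decision itself still requires the percentage points in Nayer's tables, which are not reproduced here.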
As shown in Table II, the twenty-nine subgroups of pupils who were
college-bound were homogeneous with regard to variances. It was found
that the L1 was greater than the value found in Nayer's tables at the
5 percent level. The null hypothesis, that there was no significant
difference between the groups to be pooled in regard to standard
deviations, was accepted.

Not all of the subgroups allowed themselves to be pooled on the basis
of equality of means. As shown in Table III in which eighteen of the
twenty-nine subgroups were considered, an F of 0.49 was obtained.
Entering Snedecor's ... The null hypothesis, that there was no
significant difference between the means of the subgroups to be
pooled, was accepted. These eighteen groups also satisfied the L1
test.

The eighteen subgroups who had planned to attend college, having
satisfied the two criteria necessary to pooling, were pooled into one
group (college-bound, Group III-A).
In an exactly similar manner, twenty-... of the thirty subgroups of
pupils who had not planned to attend college, having satisfied the two
criteria necessary to pooling, were pooled into one group
(non-college-bound, Group III-A).

We were now in a position to test the hypothesis that there was no
significant difference between the two pooled groups with regard to
means on the mathematics test scores, holding intelligence test scores
constant. The assumptions which had to be satisfied before the
analysis of variance and covariance tool could be applied were:

1. that there was n...

Each of these assumptions was tested by use of the Welch-Nayer test.
The first was tested by employing the "sum of squares within groups"
involving the mathematics test scores. The second was tested by using
the adjusted "sum of squares


TABLE V

TEST FOR HOMOGENEITY OF REGRESSION, COMPARISON III-A

Groups              θs           log θs    ns log θs     L1     Hypothesis

College Bound       30958.9907   4.49077                        Null
Non-College Bound   29230.9094   4.46584                        P > .05
Σ                   60189.9001   4.77952   2104.60391    .998   Accept

K = 2   H.M. = 230


TABLE VI

DATA FOR ANALYSIS OF VARIANCE AND COVARIANCE, COMPARISON III-A

Groups              N     Σx²          Σxy          Σy²

College Bound       227   30933.0749   19005.6167   42636.2555
Non-College Bound   243   38159.7613   22554.5350   42561.8354
Σ                         69092.8362   41560.1517   85198.0909

Total               470   83351.0234   55343.0042   98521.4553


TABLE VII

ADJUSTMENT TABLE FOR ANALYSIS OF COVARIANCE,
COMPARISON III-A

Groups              Correction   Adjusted Σy²

College Bound       11677.2648   30958.9907
Non-College Bound   13330.9260   29230.9094
Σ                                60189.9001
Within              24998.7495   60199.3414

Total               36746.3815   61775.0738


TABLE VIII

ANALYSIS OF VARIANCE AND COVARIANCE OF DAVIS SCORES
WITH MENTAL SCORE CONSTANT, COMPARISON III-A

Source of Variation   d.f.   S.S.         M.S.        F         Hypothesis

Within Groups         467    60199.3414    128.9065             Null
Between Groups          1     1575.7324   1575.7324   12.2238   P < .01
Total                 468    61775.0738                         Reject


TABLE IX

ADJUSTED DAVIS MEANS, COMPARISON III-A

                                          Mean     Mean     Diff. of X mean           Adjusted
Groups              N     ΣX      ΣY      X        Y        from G.M.         Corr.   Means

College Bound       227   24422   26773   107.59   117.94   -5.70             .602    114.51
Non-College Bound   243   23465   26071    96.56   107.29    5.33                     110.50
Total               470   47887           101.89


——— ve — 


i ~. 


mE —-~ 


—— wx 


March, 1954) 


The adjustment took into account the effect of differences in
intelligence on the mathematics test scores.

As shown in Table IV, college-bound Group III-A, comprised of 227
pupils, and non-college-bound Group III-A, comprised of 243 pupils,
were tested for homogeneity of variances. The resulting L1 was greater
than the table value at the 5 percent level. As shown in Table V, when
the two pooled groups were tested for homogeneity of regression
coefficients, there resulted an L1 of .998 which was greater than the
table value at the 5 percent level. The data necessary to the analysis
and adjustment are found in Tables VI and VII. It was concluded that
the two pooled groups satisfied the assumptions basic to the
application of the technique of analysis of variance and covariance.

We were now in a position to analyze results and determine the F ratio
for the two pooled groups. As shown in Table VIII, an F of 12.22 was
obtained which is greater than the table value at the 1 percent level.
The null hypothesis was rejected, and it was concluded that there was
a significant difference between the pooled groups with regard to
means on the mathematics test scores, holding intelligence scores
constant.

Which pooled group was significantly more functionally competent in
mathematics, holding intelligence constant? By applying a correction,
an adjusted Davis mean was obtained for each pooled group. As shown in
Table IX, the Davis mean of the college-bound group was adjusted from
117.94 to 114.51 whereas the Davis mean for the other group was
adjusted from 107.29 to 110.50, a difference in adjusted means of 4.01.
Hence, it was concluded that on the average, pupils who had planned to
attend college were significantly more functionally competent in
mathematics than were pupils who had not formulated such plans.
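The arithmetic behind Tables VI through IX can be retraced with a short script. The sketch below is an illustration only: it takes the sums of squares and products printed in Table VI and the group means printed in Table IX as given, and the variable and function names are ours, not the author's. Because the published figures were worked by hand, the recomputed values should be expected to agree with the tables only to within rounding.

```python
# Sums of squares and products from Table VI
# (x = intelligence test score, y = Davis mathematics score).
within = {"xx": 69092.8362, "xy": 41560.1517, "yy": 85198.0909}
total = {"xx": 83351.0234, "xy": 55343.0042, "yy": 98521.4553}
n_pupils, n_groups = 470, 2


def adjusted_ss(s):
    """Sum of squares of y left after removing the linear regression on x."""
    return s["yy"] - s["xy"] ** 2 / s["xx"]


adj_within = adjusted_ss(within)      # compare with Table VII, within row
adj_total = adjusted_ss(total)        # compare with Table VII, total row
adj_means = adj_total - adj_within    # adjusted sum of squares for means

df_means = n_groups - 1
df_within = n_pupils - n_groups - 1   # one further d.f. is used by the regression
f_ratio = (adj_means / df_means) / (adj_within / df_within)

# Adjusted means: each Davis mean is corrected for the distance of the
# group's intelligence mean from the grand mean (Table IX).
b = within["xy"] / within["xx"]       # pooled within-groups regression coefficient
grand_mean_x = 101.89
group_means = {
    "College Bound": (107.59, 117.94),
    "Non-College Bound": (96.56, 107.29),
}
adjusted_means = {
    group: mean_y - b * (mean_x - grand_mean_x)
    for group, (mean_x, mean_y) in group_means.items()
}

print("F ratio:", round(f_ratio, 2))
for group, value in adjusted_means.items():
    print(group, round(value, 2))
```

The F ratio is formed on 1 and 467 degrees of freedom, matching the d.f. column of Table VIII.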


Subsequent Tests Employed


In the event the assumption of homogeneity of variance of subgroups
was not met, the writer employed the Behrens-Fisher d-test:

$$d = \frac{\bar{Y}_1 - \bar{Y}_2}{\sqrt{\dfrac{\sum (Y_1 - \bar{Y}_1)^2}{N_1(N_1 - 1)} + \dfrac{\sum (Y_2 - \bar{Y}_2)^2}{N_2(N_2 - 1)}}}$$

where the type of quantities represented by the letters is obvious.
Sukhatme's table of d was used to determine the significance levels.

When two groups selected from a number of groups were being compared,
it became necessary to employ a t-test. Wishart's adaptation of the
t-test and the common t-test described by Kenney and Keeping (6), the
one used in the study, are equivalent when only one variable is held
constant. The following formula was used to compute the standard error
of a mean difference:

$$S.E. = \sqrt{\frac{\sum (Y_1 - \bar{Y}_1)^2 + \sum (Y_2 - \bar{Y}_2)^2}{N_1 + N_2 - 2} \cdot \frac{N_1 + N_2}{N_1 N_2}}$$



As Anderson (1) points out, caution must be ex- 
ercised in interpretation due to the higher sig- 

nificance levels imposed when means of select- 
ed samples are being compared. 
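
The two statistics described above can be sketched in a few lines. This is only an illustration: the function names and the two small score lists are hypothetical, and the sketch computes the statistics alone; the significance levels would still have to be read from the appropriate tables.

```python
from math import sqrt


def _mean(values):
    return sum(values) / len(values)


def _sum_sq_dev(values):
    """Sum of squared deviations from the mean."""
    m = _mean(values)
    return sum((v - m) ** 2 for v in values)


def behrens_fisher_d(y1, y2):
    """d statistic: mean difference over the unpooled standard error."""
    n1, n2 = len(y1), len(y2)
    se = sqrt(_sum_sq_dev(y1) / (n1 * (n1 - 1)) + _sum_sq_dev(y2) / (n2 * (n2 - 1)))
    return (_mean(y1) - _mean(y2)) / se


def pooled_standard_error(y1, y2):
    """Standard error of a mean difference with a pooled variance estimate."""
    n1, n2 = len(y1), len(y2)
    pooled_variance = (_sum_sq_dev(y1) + _sum_sq_dev(y2)) / (n1 + n2 - 2)
    return sqrt(pooled_variance * (n1 + n2) / (n1 * n2))


# Hypothetical scores for two small groups (not data from the study).
group_1 = [12, 15, 11, 14, 13]
group_2 = [9, 10, 8, 13, 10]

d = behrens_fisher_d(group_1, group_2)
se = pooled_standard_error(group_1, group_2)
t = (_mean(group_1) - _mean(group_2)) / se
print(round(d, 3), round(se, 3), round(t, 3))
```

When the two groups are of equal size, as in this made-up example, the pooled and unpooled standard errors coincide, so d and t take the same value; the two tests then differ only in the tables used to judge significance.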


The Findings 


Not all of the minor comparisons were in
complete agreement with the respective major 
comparisons. However, on the whole, it would 
seem reasonable to make the following general- 
izations from sample to population. On the av- 
erage a pupil was significantly more function- 
ally competent in mathematics if: 


1. The pupil were a boy. 

2. The pupil had taken his elementary arithme- 
tic training in rural schools rather than in 
urban schools. 

3. The pupil had planned to attend college rather
than to terminate his academic training at the
end of the twelfth grade.

4. The pupil had taken more than two years of 
mathematics in high school rather than two 
or less years of mathematics. 

5. The pupil’s grade-point average in mathematics
placed him in the upper one-fourth of the
distribution of pupils on the basis of grade-point
average in mathematics rather than in
the lower one-fourth.

6. The pupil’s academic record in high school 
ranked him in the upper one fourth of his 
graduating class rather than in the lower one- 
fourth. 

7. The pupil were in a school enrolling more
than 500 pupils rather than in one enrolling
either less than 100 or between 100 and 500.

8. The pupil were in a school, the assessed val- 
uation of whose school district placed the 
school in the upper one-fourth of the distri- 
bution of schools on the basis of assessed 
valuation of school district rather than in 
the lower one-fourth, 

9. The pupil were in a school, the average T-score
(a number which took into account both the number
of years of teaching experience and the number of
semester hours of mathematics preparation in higher
institutions) of whose mathematics teachers placed
the school in the upper one-fourth of the distribution
of schools on the basis of average T-score rather
than in the lower one-fourth.


The following measures were calculated from
the examination scores of all the pupils in the 
sample: 


1. Mean I. Q., 103. 28; Standard Deviation 13. 
(Norm: 105; 15) 

2. Mean Davis Score 114.02; S. D. 18. (Norm: 
116; 16) 

3. Coefficient of correlation between intelligence 


and functional competence in mathematics, 
.61. 




Recommendations 


It is recommended that results of research 
be made available to those directly concerned 
with their utilization, to the end that boys and 
girls may be more adequately trained in mathe- 
matics, not only for the precise purpose of train- 
ing mathematicians, but with the broader objec- 
tive of opening new worlds of thought and en- 
deavor to the layman and citizen of tomorrow. 

It is further recommended that grants be made
by governmental and private agencies to encourage
further research of an experimental nature
necessary to establish mathematics instruction
upon a scientific basis and assure its continuous
improvement.


BIBLIOGRAPHY 


1. Anderson, Kenneth E. "A Frontal Attack on the Basic Problem in
Evaluation: The Achievement of Instruction in Specific Areas,"
Journal of Experimental Education, XVII (March 1950), pp. 163-174.

2. Cochran, W. G. "Some Consequences when the Assumptions for the
Analysis of Variance are not Satisfied," Biometrics, III
(March 1947), pp. 22-28.

3. Fisher, R. A. "On the Mathematical Foundations of Theoretical
Statistics," Philosophical Transactions of the Royal Society
of London, A, 222 (1922), pp. 309-368.

4. Fisher, R. A. "The Comparison of Samples with Possibly Unequal
Variances," Annals of Eugenics, IX (1939), p. 174.

5. Johnson, Palmer O. Statistical Methods in Research (New York:
Prentice Hall, 1949).

6. Kenney, J. F. and Keeping, E. S. Mathematics of Statistics,
Part Two (New York: D. Van Nostrand Co., Inc.), p. 178.

7. National Council of Teachers of Mathematics, 

i Commission on Post-War Plans, ‘‘Final 
Report, ’’ Mathematics Teacher (Novem- 

41). 
8. Nayer, P. P. N. "An Investigation into the Application of Neyman
and Pearson's L1 Test, with Tables of Percentage Limits,"
Statistical Research Memoirs, I (1936), p. 98.

9. Peatman, J. G. and Schafer, R. "A Table of Random Numbers from
Selective Service Numbers," Journal of Psychology, XIV,
pp. 296-297.

10. Schunert, Jim. "The Association of Mathematical Achievement with
Certain Factors Resident in the Teacher, in the Teaching, in the
Pupil, and in the School," Journal of Experimental Education, XIX
(March 1951), pp. 219-238.

11. Snedecor, G. W. Statistical Methods, 4th edition (Ames, Iowa:
Iowa State College Press, 1946).

12. South Dakota Educational Directory, 1951-1952 (Pierre, South
Dakota: State Department of Education, 1952).

13. Sukhatme, P. V. "On the Fisher-Behrens Test of Significance for
the Difference in Means of Two Normal Samples," Sankhya, IV
(1938), p. 39.

14. Wishart, J. C. "Tests of Significance in Analysis of Covariance,"
Journal of the Royal Statistical Society Supplement, III (1932),
pp. 79-82.



AN APPLICATION OF THE FERGUSON 
METHOD OF COMPUTING ITEM 
CONFORMITY AND PERSON 
CONFORMITY 


H. M. FOWLER 
Ontario College of Education 
Toronto, Canada 


IN THIS paper, item conformity is defined as the product-moment
correlation between the actual answer pattern of the item and the
"ideal" answer pattern, which is a function of the distribution of the
total scores on the test. Person conformity, which is similarly
defined, refers to the relationship between the individual's responses
to the test items and the responses of the group as a whole. Item
conformities and person conformities may be computed by means of what
has been called the "Ferguson Method". Some theoretical aspects and
practical applications of the Ferguson Method are discussed in this
paper. To illustrate possible uses of this procedure, a report is
given of an empirical study in which the method was used. The data
used in the analysis were obtained by administering an experimental
edition of a 74-item vocabulary test to a group of grade 8 pupils. To
study the effect of eliminating non-conforming persons, item
conformities are first computed for the total sample of 100 persons
and then re-computed with the 18 least-conforming persons eliminated.
Similarly, the effect of excluding non-conforming items is studied by
first computing person conformities for all items and then computing
them with the 23 least-conforming items removed. The effect of
removing both non-conforming persons and non-conforming items is also
studied. Since the sample of persons used in the empirical study was
small, and since the test items were designed to measure only one type
of achievement, the conclusions must be considered tentative and
particular.


1. Item Validity


It is standard procedure in test construction to compute the
difficulty and the validity of the items used in the trial runs.
Difficulty is ordinarily defined as the percentage of a specified
group of students who get the item correct (or incorrect, if
preferred). Validity has been variously defined. As a consequence,
there are a number of ways of obtaining estimates of validity. At the
Department of Educational Research,1 during the construction of the
first experimental edition of an achievement test, we get an estimate
of curricular validity by comparing the item content with the course
of study as laid down by the Province of Ontario. Statistical
estimates of item validity can be obtained later either by comparing
the item scores with the scores on some criterion outside the test, or
with the total scores of the test. It is usually necessary to use the
total scores as the item criterion since reliable extra-test criterion
measures are not often available. Some refer to the correlation
between the item and the total test score as the "validity" of the
item, but it seems to be more appropriate to think of it as the
"conformity" of the item since it is a measure of how well the item
fits in with the total group of items.2 Securing item conformity will
not necessarily lead to high test validity if test validity is defined
in terms of correspondence with an outside criterion. Nevertheless,
most test constructors, whether by choice or not, look for item
conformity first because item conformity appears to be a prerequisite
of test reliability and without test reliability no test validity is
possible.

1. Ontario College of Education, University of Toronto.

There are a number of statistical procedures 
for computing estimates of item conformity. 
Guttman has introduced a technique for scaling 
items.3 Loevinger has proposed still another 


2. Item conformity is an indication of the discriminating power of the
item, the ability of the item to separate the sheep from the goats. A
good item is one which is passed by most of the "good" students, as
determined by the total test, and failed by most of the "poor"
students.

3. Louis Guttman. "The Cornell Technique for Scale and Intensity
Analysis," Educational and Psychological Measurement, VII (Summer
1947), pp. 247-279.


approach to test construction.4 In the Department of Research to get
item conformity, we compute the correlation between the test item and
the total test score by means of what we call the "Ferguson method".5
Readers will recognize that the Ferguson method of item analysis is
similar to Guttman's scaling methods and to the procedures discussed
by Loevinger. The Ferguson method, although not widely discussed in
the literature, preceded the other methods.


2. Item Conformity: Ferguson Method


An estimate of the conformity of a test item that is scored as one for
right and zero for wrong may be obtained by computing the
product-moment correlation between the actual pattern of response for
the item and the pattern which ranks the persons in the same order as
they are ranked by the total test score. Consider Figure 1.

The correlation between the scores of the actual answer pattern and
those of the "ideal" answer pattern is given by the following formula:

$$R_{ia} = \frac{PQ - NW}{PQ}$$

where P = the number of persons getting the item right
      Q = the number of persons getting the item wrong
      N = the total number of persons = P + Q
      W = the number of "misplaced" persons, which is given by either
          the number of persons below the point of dichotomy6 who get
          the item right or the number of persons above the point of
          dichotomy who get the item wrong.

In this example, P = 13, Q = 5, N = 18, W = 2.

Therefore,

$$R_{ia} = \frac{(13)(5) - (18)(2)}{(13)(5)} = \frac{29}{65} = .45$$

That Ria is the product-moment correlation between scores on the
actual answer pattern and scores on the best answer pattern may be
demonstrated as follows:

Let X denote a score on the actual answer pattern and Y denote a score
on the "ideal" answer pattern.

Then, summing for the N individuals,

$$\sum X = \sum X^2 = \sum Y = \sum Y^2 = P \quad \text{and} \quad \sum XY = P - W$$

Now

$$r_{XY} = \frac{N \sum XY - \sum X \sum Y}{\sqrt{[N \sum X^2 - (\sum X)^2][N \sum Y^2 - (\sum Y)^2]}}$$

$$= \frac{N(P - W) - P^2}{\sqrt{(NP - P^2)(NP - P^2)}}$$

$$= \frac{P(N - P) - NW}{P(N - P)}$$

$$= \frac{PQ - NW}{PQ} \quad \text{since } N - P = Q$$

$$= R_{ia}, \text{ the conformity of the item.}$$

The method just described may be used to analyse the items of a test.
Figure 2 shows part of an item analysis of a 33-item arithmetic test,
obtained by analysing the responses given on 100 test booklets which
were selected so as to be a representative sample of a larger sample
of 1500 test booklets. Each item had five options.7 In the item
analysis given in Figure 2, individuals appear in the columns, items
in the rows. The initial work consists of tallying the responses made
by each individual on each item of the test, where the individuals are
arranged in descending order of the total score from left to right. If
the response is correct, the square is left blank; otherwise, a number
from one to
4. Jane Loevinger. "A Systematic Approach to the Construction and
Evaluation of Tests of Ability," Psychological Monographs, LXI (1947),
pp. 1-49.

Jane Loevinger. "The Technic of Homogeneous Tests Compared with Some
Aspects of Scale Analysis and Factor Analysis," Psychological
Bulletin, XLV (November 1948), pp. 507-529.

5. G. A. Ferguson. The Reliability of Mental Tests (London: University
of London Press, 1942).

6. The point of dichotomy of an item is obtained by counting from
right to left the number of squares indicated by Q; for example, in
Figure 1, the point of dichotomy is the point which divides the 18
squares into two parts so that 13 squares are on the left and 5
squares are on the right of that point.

7. A blank square indicates a correct item, which is scored "1"; in
Figure 1, the scores of the items were shown.


[Figure 1. Conformity of an Item. Rows: the rank distribution of the
scores of 18 persons on the test; the actual answer pattern of one
item on the test; the "ideal" answer pattern of the same item. Note:
Right answers are scored "1"; wrong answers, "0". A marker indicates
the point of dichotomy for this particular item.]



k (the number of alternatives), showing the
option marked in error, or the letter "O",
showing an omitted item, is recorded. The
computing shown at the right of Figure 2 is 
simple. The value of W is obtained by count- 
ing the blank squares to the right of the point 
of dichotomy, and this value can be checked by 
counting the number of error (and omission) 
squares to the left of the point of dichotomy. 
The measure of the conformity of the item, Ria, 
is obtained by substitution in the formula given 
above. 
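
As a check on the formula, the short sketch below computes Ria by the counting method and compares it with the ordinary product-moment correlation between the actual and "ideal" patterns. The function names are ours, and the 18-person answer pattern is a hypothetical one chosen only so that P = 13, Q = 5 and W = 2, as in the worked example; it is not the pattern printed in Figure 1.

```python
from math import sqrt


def item_conformity(pattern):
    """Ferguson conformity of one item.

    `pattern` holds the 0/1 item scores of the persons, arranged in
    descending order of total test score (left to right).
    """
    n = len(pattern)
    p = sum(pattern)          # persons getting the item right
    q = n - p                 # persons getting the item wrong
    if p == 0 or q == 0:
        raise ValueError("conformity is undefined when all pass or all fail")
    # Misplaced persons: right answers to the right of the point of dichotomy.
    w = sum(pattern[p:])
    return (p * q - n * w) / (p * q)


def product_moment(x, y):
    """Ordinary product-moment correlation, written out from raw sums."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    syy = sum(b * b for b in y)
    return (n * sxy - sx * sy) / sqrt((n * sxx - sx ** 2) * (n * syy - sy ** 2))


# Hypothetical pattern: 13 right, 5 wrong, 2 right answers below the dichotomy.
actual = [1] * 11 + [0, 0] + [1, 1] + [0, 0, 0]
ideal = [1] * 13 + [0] * 5    # same P, with all right answers at the top

r_ia = item_conformity(actual)
r_xy = product_moment(actual, ideal)
print(round(r_ia, 3), round(r_xy, 3))
```

Counting the wrong answers to the left of the point of dichotomy, `p - sum(pattern[:p])`, gives the same W and can serve as the check described in the text.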

When item responses are plotted as shown 
in Figure 2, the analysis provides very useful 
information. First, if desired, estimates of 
the seductiveness of the incorrect options of any 
item may be obtained by studying the distribu- 
tion of incorrect responses. Second, the diffi- 
culty of the item is given by P. For example, 
the difficulty of the first item shown in Figure 
2 is 76%, since 76 out of 100 pupils got the it- 
em right. Third, an estimate of the conform- 
ity of the item is given by Ria. 

It is clear that the conformity coefficient is 
merely an index of the agreement between the
actual answer pattern and the ideal answer pattern
of the item. It shows how well the item
agrees with the total test score, that is, how
well the item "fits in with" or "conforms with"
the total of the items. This coefficient can be
used during the early stages of the construction
of a test to denote those items which constitute
a homogeneous, and presumably a reliable,
group of items.

The items which are most satisfactory for
use in a test are those which have the difficulty
and the conformity which best meet the purpose 
of the test. No general rules will apply in all 
cases, It has been found, however, that for 
many purposes the most satisfactory item is 
one which has a difficulty level at or near 50% 
and a conformity coefficient of .20 or higher, 
These criteria levels are arbitrary and should 
not be slavishly followed. If one were able to 
select items solely according to their conformity,
the selection might be made by ranking the
items by conformity and choosing the most
conforming items. In actual practice, however,
other things must be considered, such as the
difficulty of the items and whether the available
items conform enough for the purpose at hand.


3. Person Conformity 


If, in an item analysis, the plotting is done so that the individuals
appear in the rows, with items in the columns, it is possible to use
the Ferguson method to compute what may be called "person
conformities". The data for the analysis are the same as those which
may be used to analyse the items; only the arrangement of the data is
different. The computing is similar to that shown in Figure 2: P is
the score of the individual, rather than the difficulty of the item, Q
is the number of errors and omissions made by the individual, and W is
the number of "misplaced" items, defining a misplaced item as an easy
item failed by a high-scoring individual or a difficult item passed by
a low-scoring individual. The coefficient, Ria, is the correlation
between the actual answer pattern of the person and his "ideal" answer
pattern. It is an index of the agreement of the individual's pattern
of response with that of the group as a whole; it is a measure of his
conformity.

A non-conforming person is an idiosyncratic person in the sense that
he does not respond to the item as he would be expected to respond
considering his total score on the test. Persons with relatively high
total scores who fail many of the easy items are non-conforming;
persons with low total scores who pass many of the difficult items are
non-conforming. Person conformity, then, is relative to a particular
group of items.

Person conformities, unlike item conformities, are not used to assist
the test constructor to select items. They do, however, have definite
possibilities as aids in helping the test theoretician to understand
intricate item-person relationships. Also they would seem to have
value for the teacher or guidance counsellor as aids in diagnosing
individual difficulties. Why is a certain person non-conforming with
respect to the responses he makes on a certain test? An examination of
the item pattern may provide invaluable clues for the remedial
treatment of the student's difficulties in the area represented by the
test.
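
The re-arrangement described in this section can be sketched as follows. The function name and the small response matrix are hypothetical, and two choices are our own assumptions rather than anything stated in the paper: items with equal difficulty keep their original order, and a person who passes every item or no item is given no coefficient.

```python
def person_conformities(responses):
    """Ferguson conformity of each person.

    `responses` is a list of rows, one per person, each row holding the
    0/1 scores on the items. Items are re-ordered from easiest to hardest
    (by number of persons passing) before the counting formula is applied.
    """
    n_items = len(responses[0])
    passes = [sum(row[j] for row in responses) for j in range(n_items)]
    # Easiest item first; ties keep their original order (an assumption).
    order = sorted(range(n_items), key=lambda j: -passes[j])

    coefficients = []
    for row in responses:
        pattern = [row[j] for j in order]
        p = sum(pattern)          # the person's score
        q = n_items - p           # errors and omissions
        if p == 0 or q == 0:
            coefficients.append(None)   # undefined for perfect or zero scores
            continue
        # Misplaced items: harder items passed beyond the point of dichotomy.
        w = sum(pattern[p:])
        coefficients.append((p * q - n_items * w) / (p * q))
    return coefficients


# Hypothetical data: 4 persons by 5 items.
responses = [
    [1, 1, 1, 1, 0],
    [1, 1, 1, 0, 0],
    [1, 0, 0, 0, 1],   # passes a hard item while failing easier ones
    [1, 1, 0, 0, 0],
]
conformities = person_conformities(responses)
print(conformities)
```

In this made-up matrix only the third person is non-conforming: one of his two passes falls on one of the hardest items.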


4. Empirical Results for Items and Persons


One purpose of this paper is to examine the changes which occur in
item conformities and in person conformities under various conditions
of test sampling. The first step was to compute the conformities of
the items of a 75-item achievement test in vocabulary at the grade 8
level. The data for the analysis were obtained by selecting a
representative sample of 100 booklets from a larger group of booklets
completed by Ontario students, tested in May, 1946, during the
construction of a battery of achievement tests.

The conformity coefficients ranged from .56 to -.15 with an average of
.27. A number of the items of this experimental edition of the test
did not agree with the total score, in pupil placement, sufficiently
well to be included in revised editions of the test. Approximately
fifty of the items had conformity coefficients of .20 or higher so
that, on the basis of conformity

[Figure 2. Item Analysis: Ferguson Method.]



alone, those items below .20 might be elimin- 
ated if a 50-item test were desired. 

The second step was to compute person con- 
formities; a person analysis was completed by 
re-arranging the vocabulary test data of the it- 
em analysis. The person conformities ranged 
from . Θ1 to .13 with an average of .41. From 
this it would appear that person conformity es- 
timates tend to run higher than item conformity 
estimates. It was decided arbitrarily to class- 
ify as non-conforming the eighteen persons 
whose conformity estimates were below .30. At 
present the level can be chosen only in terms 
of its convenience because very little research 
on person conformity has been completed. An 
inspection of the table of person conformities 
showed that a relatively high proportion of the 
non-conforming persons were among those with 
the lowest test scores—ten out of eighteen per- 
sons with conformities below . 30 appeared in 
the bottom quarter of the group, as determined 
by total test score, and fourteen appeared in 
the bottom half. Is it generally true that per- 
sons with low scores are less conforming than 
persons with high scores? This is a question 
worth investigating in future research. 8 


5. Sampling of Persons and Items for Computing Item Conformities


It is a tenet of test theory that a reliable test 
is built by obtaining a group of homogeneous 
items. But if we are going to use conformity 
as a criterion in deciding whether to retain or 
eliminate an item, we should be sure that we 
are following a sound procedure in getting an 
estimate of conformity. Was the item-correl- 
ation low because the item incorrectly misplaced 
conforming persons (item conformity), or be- 
cause it quite properly misplaced non-conform- 
ing persons (person conformity)? Should the 
sample of persons providing the data for esti- 
mating the conformity of the item be chosen at 
random, after providing for representation of 
all segments of a certain population, or should 
it be made up only of conforming persons? 

Since the item conformity coefficient is obtained by computing the
correlation between item performance and total test performance of a
specified group of students, the size of the correlation depends upon
the distribution of the total scores, which in turn depends to some
extent upon the manner in which the students in the item analysis
sample were selected. If only students of a restricted range of
ability are used to provide data for the analysis, the conformity
coefficients of all the items will likely be reduced. Also, since the
total score is obtained by summing the results of the individual
items, it might happen that the apparent non-conformity of an item is
due to the non-conformity of the other items with which it is
associated. If the items as a whole are not ranking the students in
the proper order, the good item will not show a high correlation with
the incorrect ranking. The correlation of conformity depends upon the
innate merit of the item itself, upon the value of the total items in
aggregate, and upon the selection of the sample used in the item
analysis.

It is the policy of the Department to administer the first
experimental edition to a large representative group of Ontario
pupils, preferably at least one thousand. When the tests have been
scored the booklets are arranged in descending order of total test
score and an item analysis sample is selected by taking every tenth,
eleventh or twelfth booklet so as to obtain one hundred or two hundred
booklets. We use two hundred booklets whenever we can but the labour
involved in the item analysis sometimes makes it impractical to use as
many as this, particularly if there is a large number of items in the
test.
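
The selection rule just described amounts to a systematic sample from the ranked booklets. A minimal sketch follows; the function name, the starting position and the made-up total scores are all assumptions for illustration, not details given in the paper.

```python
def systematic_sample(total_scores, step, start=0):
    """Rank booklets by total score (highest first) and keep every `step`-th one."""
    ranked = sorted(total_scores, reverse=True)
    return ranked[start::step]


# Hypothetical total scores for 1000 booklets on a 74-item test.
total_scores = [(7 * i) % 75 for i in range(1000)]

item_analysis_sample = systematic_sample(total_scores, step=10)
print(len(item_analysis_sample))
```

Because the booklets are ranked before selection, the sample spreads over the whole range of total scores instead of depending on the order in which the booklets happened to be returned.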

Since the sample used in the item analysis is sometimes relatively
small, the make-up of the sample is a matter of considerable
theoretical importance. What students should be used? It is obvious
that the sample should be representative of the population for which
the test is developed: experimental editions of the test should be
administered to large representative groups and the item sub-sample
should be chosen in a representative manner. Other questions
concerning what is good practice remain. For example, should
"non-conforming" students be eliminated from the item sample?

In interpreting item conformity statistics the question arises as to
whether the non-conformity of certain items is due to the presence of
non-conforming persons in the item analysis sample or due to innate
weaknesses in the items themselves. To examine this question, item
conformities were re-computed after the eighteen least-conforming
persons were removed from the sample. After this was done, the item
conformities of the most-conforming items were computed a third time
after both non-conforming persons and non-conforming items were
eliminated.


8. The correlation between total test score and person conformity
estimate was .34, which is significantly different from zero at the 1%
level of significance. The total test scores of the 100 persons in the
analysis ranged from 66 to 19.



6. Comparison of Item Conformities


The data consisted of different estimates of the conformities of the
74 items (one item was discarded because of scoring difficulties)
according to the samples of persons and items used in obtaining these
estimates. The following data were available: (1) the item
conformities which were obtained when 100 persons and 74 items were
used in the item analysis; (2) the conformities of the same 74 items
when conformities were computed from data provided by the
most-conforming persons of the original item analysis sample: the 18
least-conforming persons (as determined by the person analysis) were
removed from the sample and the item conformities were then
re-computed; (3) the item conformities of the 51 most-conforming items
(as judged by the original item analysis) when these conformities were
estimated from data provided by the 82 most-conforming persons.
When the item conformities obtained under the three conditions of
sampling were compared, it was found that item conformities on the
whole change very little when non-conforming persons, non-conforming
items, or both, are removed from the item analysis sample. Most of the
items that were conforming (had an r of .20 or better) remained so
regardless of the make-up of the sample. Furthermore, many of the
changes that do occur are small, no greater than might be expected
from the reduction in the size of the sample.9


The removal of the non-conforming persons resulted in a general
reduction of the conformity estimates and an increase in the number of
non-conforming items. The average of the correlation coefficients was
reduced from .28 to .25; the number of non-conforming items was
increased from 23 to 28. Only thirteen of the 74 items changed their
conformity status: nine items which were conforming became
non-conforming and four which were non-conforming became conforming.
The nine items whose conformity dropped below .20 had original
conformities of .31 or lower. If .32, rather than .20, had been used
as the critical level of conformity, these items would have been
non-conforming in the original item analysis. Thus it would appear
that non-conforming items can be eliminated either by setting a fairly
low critical level of conformity for a sample of conforming persons or
by setting a higher critical level of conformity for an unselected
sample of persons. Small, and probably not statistically significant,
increases in conformity occurred for four of the items.

9. Reduction of the item analysis sample from 100 to 82, which makes
the sample more select, is undoubtedly a factor in bringing about some
reduction in the size of the conformity estimates, which are
correlation coefficients.

Most of the larger changes in the conformity 
estimates appeared to be due to the removal of 
non-conforming items. The item conformities 
obtained for the 82 most-conforming persons 
and the total of 74 items agreed very closely 
with those obtained for the 82 most-conforming 
persons and the 51 most-conforming items. We 
may tentatively conclude that when the number 
of persons is held constant, a moderate elimination
of non-conforming items will not greatly
affect the conformity of the remaining items. 

The practical implications of the above ten- 
tative conclusions are: (1) Obtaining an item 
analysis sample by selecting a representative 
sample from a large group of test booklets appears
to be satisfactory; there is no need to
eliminate non-conforming persons at the outset,
but, as a safeguard, if the number of trial items
warrants it, the selection level could be raised
from .20 to .30. (2) The presence of very poor
(non-conforming) items in a test will not greatly
affect the conformity of the other items; in other
words, the item analysis done on the first
experimental "run" of a test will provide useful
conformity estimates of the better items, esti- 
mates which should not be very different from 
those which might be obtained from later runs. 


7. Comparison of Person Conformities 


The available data, consisting of different 
estimates of the person conformities according 
to the samples of items and persons used, were 
as follows: (1) conformity estimates for 100 
persons when 74 items were used; (2) conform- 
ity estimates for 100 persons based on the 51 
most-conforming items; (3) conformity esti- 
mates for the 82 most-conforming persons com- 
puted from the 51 most-conforming items. 

The average person conformities for these 
three conditions of sampling were respectively 
.41, .34, and . 38. This suggests that the effect 
of eliminating non-conforming items is to re- 
duce the size of the person conformities, where- 
as the effect of eliminating the non-conforming 
persons is to increase slightly the conformities 
of the remaining (conforming) persons. The per- 
centages of non-conforming persons for the 
three sample situations were 18%, 36%, and
23%. Thus it would appear that the percentage
of non-conforming persons is increased when
the non-conforming items are removed, but it
is decreased when the non-conforming persons
are eliminated. A comparison of the person
conformities with the item conformities suggests
that person conformities tend to run higher
than item conformities.

from 100 to 82, which makes the sample more select,
pout some reduction in the size of the conformity

The most striking feature of the effect on 
person conformity of removing non-conforming 
persons is that the conformities for persons 
show a considerable amount of stability in the 
face of changes in the analysis samples: those 
that were definitely conforming remain conform- 
ing, and those that were definitely non-conform- 
ing remain non-conforming. Of the 100 persons 
in the original sample, 58 show conformity esti- 
mates above .30 regardless of the type of per- 
sons or items used in the analysis. 

It is clear that the effect, in general, of re- 
moving non-conforming items from the analysis 
is to raise the standard of person conformity. 
In other words, on a selected group of items
fewer people will be conforming according to a 
pre-designated conformity level, than on 
an unselected group of items. As is often the 
case, the correlation between two variables (in 
this case the person answer pattern and the ideal 
answer pattern) is decreased when the variability 
of one of the variables is decreased. 
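The general statistical remark above, that a correlation shrinks when the variability of one variable is reduced, can be illustrated with a small simulation. This is only a sketch under stated assumptions: the data are simulated normal variates, the coefficient is an ordinary Pearson r rather than the conformity coefficient used in the study, and the names `pearson_r`, `r_full`, and `r_restricted` are illustrative.

```python
import math
import random


def pearson_r(xs, ys):
    """Ordinary product-moment correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Simulated data (an assumption, not data from the study):
# y depends linearly on x with a population correlation of about .6.
rng = random.Random(0)
xs = [rng.gauss(0, 1) for _ in range(5000)]
ys = [0.6 * x + 0.8 * rng.gauss(0, 1) for x in xs]

r_full = pearson_r(xs, ys)

# Restrict the range of x to a narrow band, which reduces its variability.
kept = [(x, y) for x, y in zip(xs, ys) if abs(x) < 0.5]
r_restricted = pearson_r([x for x, _ in kept], [y for _, y in kept])

print(f"full sample r = {r_full:.2f}, restricted-range r = {r_restricted:.2f}")
```

The restricted-range coefficient comes out well below the full-sample one, which parallels the observation that fewer persons reach a fixed conformity level on a selected group of items.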

When non-conforming persons were removed 
from the analysis sample, only 7 out of 82 per- 
sons changed their conformity category. Thus 
we may say that when the number of items is 
held constant, an elimination of non-conforming 
persons will not greatly affect the conformities 
of the remaining persons. This agrees with a 
corresponding tentative conclusion for items. It 
appeared, however, that items (persons held 
constant) were somewhat more stable than per- 
sons (items held constant). 


8. Reasons for Item Non-Conformity and Per- 
son Non-Conformity 


Because of the interaction between persons 
and items it is difficult to decide whether non- 
conforming persons produce non-conforming it- 
ems or whether non-conforming items make 
persons non-conforming. Is non-conformity 
due to the interaction of items and persons or 
is it due to outside factors? Are items non- 
conforming because of weaknesses in their con- 
struction or because of peculiarities in the re- 
sponses of some of the persons in the item anal- 
ysis sample? A more detailed examination of 
some of the non-conforming items and non-con- 
forming persons may help to clarify these points. 
A list of the 23 least-conforming items together with data relating to them was prepared. For each of the 23 items the conformity coefficient, the number of misplaced persons, and the number of responses to each option were tabulated. A study of this table suggested some hypotheses concerning the reasons for item non-conformity. In some cases one or more of the incorrect options was weak; in other cases the options had double meanings or were not all grammatically parallel. One item appeared to be measuring both spelling and vocabulary, which may account for its non-conformity. Some items were too easy; where the item is correctly or incorrectly answered by most of the pupils, the reliability of the conformity coefficient is low.10 Since most of the non-conforming items appeared to be structurally inferior, their non-conformity would seem to be largely the concern of the test constructor rather than of the person who is interested in the idiosyncrasies of individuals.

Let us now examine the reasons for person non-conformity. Are persons non-conforming because of the presence of non-conforming items in the test or for some other reason? The answer to this question will interest all teachers who are diagnosing weaknesses of pupils and applying remedial techniques. A table was prepared which showed the conformity estimates of the 18 least-conforming persons based on 74 items, the estimates based on 51 items, and information concerning the total score on the vocabulary test, the total number of items misplaced, the number of easy items misplaced, and the number of difficult items misplaced.

From this table, it was noted that person non-conformity appeared not to be due to the presence of non-conforming items. One person, for example, misplaced 15 easy items and 15 difficult items, but of these only 13% of the easy items and 33% of the difficult items were non-conforming items. His conformity, which was .13 for the test as a whole, dropped to [value illegible] when the non-conforming items were removed from the test. In general, the persons who were non-conforming for the test as a whole were just as non-conforming for the purified test made up of conforming items only. We may conclude that for the most part person non-conformity appears not to be due to item weaknesses. There are, however, some persons who may be adversely affected by non-conforming items.

It was also apparent that to the extent that person non-conformity may be due to item weaknesses, the more difficult items are more blameworthy than the easy items. For example, of fifteen easy items (difficulties ranging from 98% to 64%) "misplaced" by one person, two were non-conforming items; of fifteen difficult items (difficulties ranging from 64% to 14%) "misplaced," five were non-conforming items. In other words, of the non-conforming items, those which seem most likely to add to person non-conformity are those that are difficult for the group as a whole; apparently because of weaknesses inherent in the items, both good and poor students choose the incorrect options.

10. This arises because the conformity coefficient is not independent of the difficulty of the item. When an item is very easy or very difficult, changes in the obtained conformity coefficient as great as .20 or .30 can be expected "by chance." This has been verified by unpublished research completed by the author.


We may summarize by saying that one cause of person non-conformity may be item weakness or item non-conformity, but this may not be the only or even the most important cause. What, then, are the other factors which cause a student to get easy items wrong or difficult items right? What causes students to make the mistakes they do make? To what may we attribute the "individuality" of a person? In the case of an objective-type test, part of the non-conformity may be nothing more than a function of his guessing the correct answers. On the other hand, there may be a "real" characteristic of the person which might be revealed by a case study analysis. The problem is worth further investigation.


TABLES FOR TRANSMUTATION OF ORDERS 
OF MERIT INTO UNITS OF AMOUNT 
OR SCORES 


KENNETH E. ANDERSON, ROBERT T. GRAY 
EINAR V. KULLSTEDT 


School of Education, University of Kansas


THE FOLLOWING tables are adapted from a table presented by C. L. Hull* in 1922 for the purpose of changing orders of merit, or ranks, into normalized scores. In its original form this table contained corresponding values of "percent position" and normalized scores. The "percent position" was defined as

$$\frac{100(R - .5)}{N}$$

where R is the rank of the individual in the series and N the number of individuals ranked. By means of this table, then, it was possible to provide a set of normalized scores on a given characteristic for a group of individuals by first ranking them on the characteristic, then transforming these ranks into percent positions by the formula, and finally obtaining from the table the corresponding normally distributed scores.

In order to obviate computing the percent positions of the individuals of a group when it is desired to find their normalized scores, the following tables were developed. They contain the normalized scores corresponding to every rank order for groups of all sizes from 1 to 100 individuals. To find the normalized score for a given individual, it is necessary only to find the table column corresponding to the number of individuals in the group and the table row corresponding to the rank of the individual in the group. The score is found at their intersection. For example, suppose an individual ranks 8th in a group of 35 with respect to a given characteristic. Locating the table column corresponding to "size of class" equal to 35 and the table row corresponding to "rank in class" equal to 8, we find a value of 66 at their intersection. This value is the score, out of a possible 100, which would theoretically be made by the 8th ranked individual in a group of 35, if the scores were normally distributed.
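The percent position formula can be sketched in a few lines of Python. The function name `percent_position` is an illustrative choice, not something from the article; the conversion from percent position to a Hull score still requires Hull's table and is not attempted here.

```python
def percent_position(rank, n):
    """Percent position 100(R - .5)/N for rank R in a group of N individuals."""
    if not 1 <= rank <= n:
        raise ValueError("rank must lie between 1 and the group size")
    return 100 * (rank - 0.5) / n


# The individual ranked 8th in a group of 35, as in the example in the text.
print(round(percent_position(8, 35), 2))
```

The value obtained would then be looked up in Hull's table; the tables that follow save this step by tabulating the score directly by rank and class size.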


Tables I-VI, as adapted from Hull, are based on a range of ranked ability arbitrarily cut off at plus and minus 2.5 standard deviations. The baseline of his curve is 5 standard deviations and each of the 100 parts is equal to 0.05 standard deviations. Thus a rank of 2 in 50 gives a percent position of:

$$P = \frac{100(R - .5)}{N} = \frac{100(2 - .5)}{50} = 3.00$$

Translated according to his table we obtain a score of 86.

Table VII gives the normal equivalents of ranks in groups of all sizes from 1 to 25, where:

$$T = 50 + \frac{10(X - M)}{S.D.}$$

Thus, a rank of 1 in 25 has a percent position of:

$$P = \frac{100(R - .5)}{N} = \frac{100(1 - .5)}{25} = 2.00$$

Referring to the unit normal curve, we obtain an x/σ of 2.05. Thus the normalized equivalent of a rank of 1 in 25 is:

$$T = 50 + 10(2.05) = 70.5 \text{ or } 71.$$

If one wishes to extend Table VII for groups higher than 25, calculate the percent position as before, look up the x/σ value in a unit normal table, and calculate the normalized equivalent. For example, in a group of 31 individuals, we have:

Rank    % Position    x/σ     T Score
  1     98.387097     2.14     71.4
  2     95.161291     1.66     66.6

* C. L. Hull, "The Computation of Pearson's r from Ranked Data," Journal of Applied Psychology, VI (1922).


TABLE I

ORDERS OF MERIT INTO UNITS OF AMOUNT
(ACCORDING TO HULL)

Rank in Class (rows), Size of Class (columns)

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25

3 69 13 75 77 79 80 81 82 83 83 8l 85 8 86 87 87 87 88 88 88 89 641 
50 Ὁ 56 60 63 66 67 69 70 72 73 73 1h 75 76 76 77 78 18 79 79 79 80 8d 2 
L9 Ὁ 54 57 60 62 63 65 66 67 68 69 70 Τι ΤΙ 72 73 73 7h 7h 75 7 5 
L8 L6 50 53 56 58 59 61 62 63 6L 65 66 67 68 68 69 70 70 Τι 744 
7 37 L3 L7 50 52 55 56 58 59 60 61 62 63 6h 65 66 66 67 67 6 5 
l6 23 34 LO ll h8 50 52 5L 55 57 58 59 60 61 62 63 61 64 65 6516 
hs 21 33 38 42 L5 48 50 52 53 55 56 57 58 59 60 63 61 62 63 7 
bh 20 31 37 hl Lh L6 L8 50 52 53 54 55 56 57 58 59 60 608 
ΠΕ C19 30 35 39 L2 L5 L7 L8 50 51 53 Sk 55 56 51 58 τῆ ο 
L2 BN 18 28 34 38 L1 L3 L5 L7 L9 50 51 52 53 54 55 56| 10 
li 15 8 33 37 ho h2 hh L6 47 L9 50 51 52 53 Shia 
Lo 20 15 21 32 36 39 hl 43 L5 46 h8 l9 50 51 5212 
39 23 20 16 26 31 35 38 LO L2 ll hs 47 h8 [ιο 50|13 
38 3 3 26 23 20 16 N 15 25 30 3h 37 39 11 L3 bh L6 L7 LIJ 2h 
37 37 36 35 3h 33 31 30 28 26 23 2016 9 15 2h 29 33 36 38 Lo 42 L3 45 L5|i 
36) 39 38 37 36 35 3h 33 32 30 28 26 2h 21 16 ὃν 1h 2h 29 32 35 37 39 L1 43 hh] 16 
35|ho 39 39 38 37 36 35 33 32 30 29 27 2h 21 17 SX. 1h 23 28 32 34 37 39 LO h?]17 
3411 L1 ho 39 38 37 36 35 3h 32 31 29 27 2h 21 17 9 31 3l 36 38 LOJ 18 


33|h2 L2 hi ho 39 38 37 36 35 3h 33 31 29 27 25 21 11 9 27 30 33 35 37/19 
32|h3 L3 h2 L1 hO ho 39 38 37 36 3h 33 31 30 27 25 22 17 10 
3i|hh hh h3 L2 h2 ll ho 39 38 37 36 35 33 32 30 28 25 22 18 21 25 29 32|21 
30|h6 L5 hl hl h3 42 41 bo ho 39 37 36 35 3h 32 30 28 26 22 12 21 25 29|22 
29/47 L6 L5 L5 hh h3 be l2 hà ho 32 50 37 35 3à 32 31 28 26 23 
28|h8 L7 L6 μ6 L5 bh hh L3 L2 li lo 39 38 37 36 34 33 31 29 26 23 

27|h9 L8 L? L7 L6 L6 L5 hh L3 13 L2 hi ho 39 38 36 35 33 31 29 19 11 
26ἱ50 L9 LG L8 L7 L7 LÉ L5 L5 lh L3 L2 li ho 39 38 37 35 3h 32 2l 19 WN 


50 49 48 47 46 45 44 43 42 41 40 39 38 37 36 35 34 33 32 31 30 29 28 27 26


TABLE II

ORDERS OF MERIT INTO UNITS OF AMOUNT
(ACCORDING TO HULL)

Rank in Class (rows), Size of Class (columns)

26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50


i[89 89 90 90 90 90 90 91 91 91 ML 91 jl 92 92 92 92 92 92 92 92 23 23 23 93l 1 
2 81 81 Gl ὃν ac 82 82 03 03 83 83 Oh 8h Oh 8L ü5 85 ὃς 85 85 85 86 86 δέ 86 2 
3| 15 76 76 ΤΊ ΤΊ ΤΊ 78 78 78 19 79 79 79 80 80 80 80 81 81 81 81 81 82 82 82: 3 
i | τὸ 72 13 13 T3 th Th 7h 75 75 75 76 16 76 76 77 77 ΤΊ 77 78 78 78 78 D 79) b 
5 | 69 69 69 70 70 71 Τι T? J32 T0 12 13 D 13 Ih Τι Th 15 15 75 15 16 76 76 16] 5 
6 | 66 66 67 61 €8 68 69 69 69 70 1C το τι τι Τι 72 72 72 73 73 13 73 7h 7h Thi 6 
7163 64 6h 65 65 66 66 67 67 68 68 68 69 69 69 70 70 Το 71 71 71 Tl T? 12 72} 7 
8 | 61 62 62 63 63 ὅι 6h 65 65 66 66 66 67 7 67 63 BF OF 67 ο] εὖ 6B GB ὀρ 69 8 


59 60 be 65 65 66 66 66 67 67 67 68 68 68 69 69] 9 
60 61 61 62 62 63 63 6l 6l 65 65 65 6 67 6] εξ δὲ 61 61 6710 


64 64 64 65:12 
62 63 63 63|13 
61 61 62 62 1 
6 las 




28 31 33 35 37 38 39 41 L2 l3 b 
25 28 31 33 35 36 38 39 LO T 


L5 L5 
L3 Lb L5 L6 L6 L7 L8 L8 [ο 49 50 51 51 51424 
19 24 27 30 32 3h 36 37 38 LO hl L2 L3 


UL hh hs μό L7 L7 L8 L8 be 19 50 50125 


TABLE III

ORDERS OF MERIT INTO UNITS OF AMOUNT
(ACCORDING TO HULL)

Rank in Class (rows), Size of Class (columns)

51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75


1 9393 93:93 93/93 93 2 B m oh 9h 9h 9l 9h 2 9L 9L οἷν 9h 9h 95 95 95 9511 
2 


86 86 86 87 87 87 87 8 7 87 88 88 88 88 8 


88 88 88 88 89 89 89 89 89|2 


3! 82 82 82 03 83 03 83 83 83 83 8l 8l, 8L 8l Bh 8l, Bh 8. 85 85 85 85 85 85 85,3 


| 66 66 61 67 67 




|65 65 65 66 66 66 66 67 67 67 67 68 66 6ὲ 66 68 69 69 69 69 69 70 70 70 
13 6ἱι 64 6L 6l 65 65 65 66 66 66 66 66 67 67 67 67 68 68 68 68 68 68 69 69 69 
162 63 63 63 6h 6L 6h 6h 65 65 65 65 66 66 66 66 67 67 67 67 67 67 68 68 68 


15|61 62 62 62 63 63 63 63 6l 6h 6l, 6l, 65 65 65 65 65 66 66 66 66 67 67 67 67/19 


79 79 79 80 80 80 80 80 80 80 81 81 81 81 81 81 81 82 82 82 82 82 82 82 82|ΐι 
16 77 77 77 ΤΊ ΤΊ 16 78 78 78 78 78 79 79 T9 79 79 79 79 80 80 80 80 80 80|5 
i 7L Th 75 15 15 15 75 16 76 16 16 16 77 77 77 77 77 77 77 78 18 18 78 78 Τ8ἱ6 
72 15 13 13 73 73 7h 7L 7h 7L 7L 75 15 75 75 15 15 76 76 16 16 16 76 16 77,7 
171 71 71 71 72 72 72 72 72 73 13 13 73 73 73 7h Th 7h 7h 7h 7h 15 15 15 158 
65 69 10 70 70 70 70 71 71 Τι 71 71 72 72 72 72 12 13 73 73 73 73 73 Th "hlo 
68 68 68 68 69 69 69 69 69 Το Το 70 70 71 71 Τι Τι Τι 10 72 72 12 72 12 72110 
68 68 68 68 69 69 69 65 69 70 70 70 70 69 71 71 Τι 71 "lini 


|13 


14 


16160 60 61 61 61 62 62 62 63 63 63 63 6l οἷν 6l 6l 6l 65 65 65 65 66 66 66 66|1i 


17159 59 60 60 60 61 61 61 62 62 62 62 63 63 63 63 6l 6l, 6h 6l 6l 65 65 65 65 
1858 58 59 59 59 60 60 60 61 61 61 61 62 62 62 62 63 63 63 63 64 6l, 6l, 64 6l 
19157 57 58 58 58 59 59 59 60 60 60 60 61 61 61 61 62 62 62 62 63 63 63 63 6h 
20|56 56 57 51 57 58 58 58 59 59 59 6C 60 60 60 61 61 61 61 62 62 62 62 63 63 
21|55 55 56 56 56 57 57 58 58 58 58 59 59 59 59 60 60 60 61 61 61 61 61 62 62 
22|5h 5h 55 55 55 56 56 57 57 57 58 58 58 58 59 59 59 59 60 6u 6u 60 61 61 61 
23|53 53 5L 58 55 55 55 56 56 56 57 57 57 57 58 58 58 59 59 59 59 60 60 60 60 
2L|52 52 52 53 5h 5h Sh 55 55 55 56 56 56 57 57 57 58 58 58 58 59 59 59 59 60 
25 51 5 52 52 53 53 53 Sk 5l 55 55 55 56 56 56 56 57 57 57 58 58 58 58 59 59 
21|h9 5 
281,8 
Solis by Lt UB ka le lp oe oo oo on oe ee 55 56 55 
30 Ὁ 50 50 51 51 52 52 52 53 53 53 5l, 5l 54 55 

31|h5 46 46 L7 L7 18 L8 li? L9 50 50 50 51 51 52 52 52 53 53 53 54 Sh Sh 54 55 


50 50 51 51 52 52 53 53 53 54 5l 5h 55 55 55 56 56 56 56 57 57 57 58 


32|lh L5 hs L6 Ló L7 L7 L8 48 L9 L9 50 50 50 51 51 51 52 52 52 53 53 53 5h 5h 
33|h3 LL Ll LS L5 Ló 47 L7 h7 L8 48 L9 L9 50 50 50 51 51 51 52 52 52 53 53 53 
3h|h2 43 L3 LL L5 L5 L6 h6 L? L7 L8 L8 h8 ho L9 So 50 50 51 51 51.52 52 52 53 
35 hi h2 42 43 Ll bb hi5 Ls L6 L6 47 L7 L8 L8 L8 lie L9 50 50 50 51 51 51 52 52 
36 [11ο hi hi 42 43 L3 hh hh L5 L5 L6 L6 L? L7 h8 L8 L9 l9 l9 50 50 50 51 51 51 
37|39 LO LO L1 42 42 43 L3 LL L5 LS L6 l6 L6 L7 L7 L8 L8 L9 ho i9 So 50 50 51 
3838 38 39 LO hl hl h2 L3 L3 hh LL L5 L5 L6 h6 L7 L7 L7 L8 h8 L9 L9 hi9 50 50 
39|36 37 38 39 LO LO hl 42 L42 43 L3 bh hl US L5 L6 h6 L7 L7 h8 h8 L8 L9 L9 Lo 
[ο|35 36 37 38 39 39 LO hl Li L2 L2 L3 ll LL LS LS L6 L6 L6 L7 L7 LB L8 L8 L9 


hi|3h 35 36 37 38 38 39 LO LO hl L2 L2 L3 L3 LL hh L5 h5 L6 L6 h6 L7 L7 L8 L8 
42132 3h 35 36 36 37 38 39 39 LO hl Ll L2 L2 L3 hh hh bb hs lis L6 L6 L7 L7 L7 
μλ|31 32 33 3L 35 36 37 38 38 39 LO LO hl L2 L2 l3 L3 hl hh L6 l5 LS L6 Ló L7 
hi|22 31 32 33 3h 35 36 37 37 38 39 LO LO hl hl h2 h2 h3 L3 hh hh L5 hs hó L6 
L5|28 29 30 32 33 3h 35 36 36 37 38 39 39 LO LO hi h2 l2 L3 h3 hh bh L5 LS L5 
46/26 27 29 30 31 33 3h 3L 35 36 37 38 38 39 LO LC hl hi L2 h2 μ3 43 hh hh L5 


O 51 51 52 52 53 53 53 5h 5h 5l 55 55 55 56 56 56 57 57 51 58 58 58 58. 


ο 
L9 Lg 50 50 5ο 51 51 52 52 52 53 53 5l Sk 5h Sh 55 55 55 56 56 56 56 57128 
L8 L8 LI h9 50 50 50 51 51 52 52 52 53 53 53 5l Sh 5h 55 55 55 55 56 56i 


17 
16 


29 


L5 
na 


uT7|2h 26 27 29 30 31 32 33 3h 35 36 37 37 38 39 39 ho Li hi 42 h2 43 43 hh bh|hT 


48|21 23 25 27 28 30 31 32 33 3h 35 36 36 37 38 39 39 Ὁ hlli h2 42 43 43 


Ὁ 39 ho LO li hi 2 


14 18 21 23 25 27 28 29 31 32 33 3L 3h 35 36 37 3 


LO Ly 
49118 21 23 25 27 28 30 31 32 33 3h 35 35 36 37 38 38 3 39 LO lili he 42 h3|h9 
50 7 38 3 


L8 
50 


TABLE IV

ORDERS OF MERIT INTO UNITS OF AMOUNT
(ACCORDING TO HULL)

Rank in Class (rows), Size of Class (columns)

76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100


1195 95 95 95 9 
2,89 89 89 89 8 
3|85 85 86 86 8 

83 8 


5 81 81 8 
6 19 19 1 
"Tt $T ΤΙ TN T 
8175 75 75 76 7 
9] Τὶν 74 7L 7h 7 
Ht 
7 
19110 70 71 717 
70 707 
168 68 69 69 6 
15|67 67 68 68 6 
16/66 67 67 67 
17|65 66 66 66 6 
18/65 65 65 65 6 
19| 6L 6h 6l 64 6 
20163 63 63 63 6 


67 67 67 


5 95 95 95 95 95 95 95 
9 89 89 90 90 90 90 90 ο 


2 86 86 86 86 B6 86 86 86 81 87 B7 


95 95 96 95 95 95 95 96 96 96 3 
$0 90 90 90 90 90 90 90 90 91 21 91 91| 2 


6 96 96; 1 
87 87 87 87 87 


3 83 83 83 83 8L 8L 8l 3L 8L 8L 8l 8l 8L 8h 84 85 85 85 85 85: L 


1 81 81 81 81 81 81 82 82 
9 19 12 


16 77 


6 76 16 76 76 
15 75 


4 15 75 15 15 


1 τι 71 Τι Τι 12 72 
ο 70 70 70 10 


6 66 67 61 61 67 67 61 61 68 68 
6 66 66 66 66 66 66 67 67 67 67 
i, 65 65 65 65 65 66 6 

L ók 6l 6l 6h 65 65 Ej 65 65 65 


82 82 82 82 82 82 82 82 83 83 83 83; 5 
19 79 80 80 80 80 80 80 80 80 8ο 80 81 81 31 δὲ 81 81: 6 
7 77 16 18 18 78 78 78 78 78 78 79 79 79 19 79 
τη T0111 ΤΙ ΤΙ 18 78 78 78 78 78} 8 
15 15 16 76 76 16 16 16 16 76 16 ΤΊ ΤΙ 11: 9 
3 73 73 74 Τὶ Th 7L Τὶ Τι 1h 74 15 75 75 75 15 15 75 
? τὸ 72 12 13 73 73 1313 73 73 D Th 7h 7h 7h 1h 7h 7h 7L 75/21 

12 12 1 
τα τα Τι Τι Τὰ Τι 11 72 12 12 72 12 72 
9 69 69 69 69 70 10 10 70 70 το Τι Τι 1 Τι 11 Τ1 τι 71 72 Τ2| 
8 68 68 68 69 69 69 69 69 69 69 1ο 70 70 70 70 70 70 71 72 71]15 
68 68 68 68 68 68 68 69 6 


13 13 13 13 73 73 Τὸ 13 782 


21162 62 63 63 63 63 63 63 64 6l Ob 

ΤΠΕ 
23/61 61 61 61 61 62 62 62 62 62 63 63 63 63 63 63 6l 6l 6l, 6h 6h 64 65 65 65123 
23} 62 61 61 61 61 62 62 62 6e EF Ea a Ga 2 63 63 63 63 63 03 ὅν GL Ga Os Gulan 
50 eo ος 60 60 60 60 60 61 61 61 61 61 62 52 62 62 62 63 63 63 63 63 63 63 6h77 
26/58 59 59 55 59 60 60 60 60 6ο 61 61 61 61 61 61 62 62 62 62 62 63 63 63 63126 
21/58 58 58 58 59 59 59 59 59 60 60 60 60 60 61 61. 61 61 61 62 62 62 62 62 62/27 
28,57 67 68 58 58 58 58 59 59 22 co 59 60 60 60 60 60 61 61 61 61 61 61 62 62|28 
29/56 51 57 51 51 58 58 38 28 58 89 59 59 59 59 60 60 60 60 6ο 61 61 61 62 61/29 
3056 56 56 56 57 57 57 57 20 20 2 23 28 69 69 59 59 59 60 60 60 60 60 60 613} 
ales ce ce £6 26 66 56 51 57 91 51 58 58 28 28 8 29 c9 59 59 59 60 60 60 693 
35] 55 55 55 66 56 56 56 57 57 57 57 58 58 8 58 58 58 59 59 59 59 59 60|32 
33| 54 Sh Sh 5h 55 55 55 55 56 56 56 56 57 $1 57 57 58 58 58 $8 58 58 59 59 59133 
3L E3 23 el ch eh eh 95 $5 55 55 56 56 36 56 56 51 57 51 51 58 38 58 58 58 5813. 
35152 23 23 e3 63 Sh 5h 9h 5h 55 55 55 52 26 56 96 96 57 91 57 21 21 29 58 58|35 
E a p 22 5} $3 5} 53 5h 5h Sh 5h 55 55 3 5 : 2 ze 5 E 51 51 2 Fi Al = 

25 56 56 

| ΠΕ Ch 65 25 55 55 55 56 56 56 56 38 


38|50 51 51 51 52 92 52 
[δ [ο 5ο το 51 Σι 51 τα > 
LO LO 41 hl 
Ην ΤΗΣ 
la i μ hô 19 i lo 50 50 50 SL 72 21 cl το 2 52 92 5 
L7 L7 18 L8 Lô l9 h9 L9 50 50 50 51 21 27 
18 ig 9 5050509221 δι δὲ ὁ 


L7 L 


Lh bs hs 
a hh hs 
43 43 bb 


7 
48 
L9 
50 


LG L6 L6 L7 
LS L5 L6 L6 Ló L7 
lh US LS h5 L6 Ló L6 hT 


8 L8 Lð 


μη L7 L3 L8 L8 LS by he b? 50 50 50 51212 
L5 LS L6 Ló i7 ho h8 LO ho ho h9 h9 50 50 50 51 51 51 51 52 52 52 5 
16 1 11 h7 LO UB μὲ HET po Lg L9 be 50 5o 5 z 

L9 L9 L9 50 50 50 51 51 51 21118 
h8 L9 L9 uy L9 50 50 50 51 51!h9 
L8 L8 L8 L9 L9 ἰι9 L9 50 50 50j50 


$i 
M? μὰ b ph ED Mo bg us ἰδ 46 hó h7 b] b L8 


nu [8 L8 Lô LI 


L7 L7 L8 L8 


2 
£i 51 51 52 52 52 52 5 


$0 51 51 51 51 52 


TABLE V

ORDERS OF MERIT INTO UNITS OF AMOUNT
(ACCORDING TO HULL)

Rank in Class (rows), Size of Class (columns)

76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100

5 17 19 21 22 2h 25 26 27 28 29 30 31 31 32 33 33 22 35 35 36 36176 

75 ll, 17 19 21 22 2h 25 26 27 28 29 30 30 31 32 32 33 34 34 36 35 36177 
7hí33 5 11 1h 17 19 21 22 2h 25 26 27 28 29 29 30 31 32 32 33 33 3h 36 35178 
73115 11 5 11 14 17 19 21 22 23 25 26 27 28 29 29 30 31 32 32 33 33 3h 34179 
12|18 15 1h 17 19 20 22 23 25 26 27 28 28 29 30 31 11 32 33 33 3180 
71, 20 18 11 1h 17 19 20 22 23 2h 26 27 27 28 29 30 31 31 32 32 3381 
70122 20 2 10 1} 16 19 20 22 23 24 25 26 27 28 29 30 30 a 32 3282 
69123 22 5 10 1h 16 18 20 22 23 2h 25 26 27 28 29 30 30 31 32/83 
62! 25 2h 6 2 10 lh 16 18 20 22 23 2h 25 26 27 28 29 29 30 318] 
67126 25 12 6 5 10 1h 16 18 20 21 23 2h 25 26 27 28 29 29 30|85 
66128 26 1612 6 5 10 1h 16 18 20 21 23 2h 25 26 27 28 28 29186 
65|29 28 181612 6 5 10 13 16 18 20 21 23 24 25 26 27 28 28187 
6h 30 29 21 19 16 12 5 10 13 16 18 20 21 22 2l, 25 26 27 27/88 
63|31 30 25 21 19 16 12 5 10 13 16 18 20 21 22 2h 25 26 26|89 
62|32 31 2h 23 21 19 16 12 2 10 13 16 18 19 21 22 23 2h 25/90 
61|33 3? 26 25 23 21 19 16 12 6 5 10 13 16 18 19 21 22 23 2h91 
60|3h 33 27 26 25 23 21 19 16 13 5 10 13 16 18 19 21 22 23|9? 
59135 3} 69 28 26 25 23 21 19 16 11 6 5 10 13 15 17 19 21 22199 
58:36 35 31 30 29 28 27 25 23 22 19 17 13 N 5 10 13 15 17 19 21194 
57|36 36 31 30 29 28 27 25 2h 22 20 17 13 Th. h 10 13 16 17 19195 
6137 37 32 31 30 29 28 27 25 2h 22 2017 13 7 hl 9 13 15 1196 
55136 38 33 32 32 31 29 28 27 26 2h 22 20 17 11 7 b 9 13 15/97 
51139 38 3L 34 33 32 31 30 29 27 26 2h 22 20 1 13 T l 9 1398 
53,hO 39 39 38 37 37 36 35 35 31 33 32 31 30 29 27 26 2l 22 20 iT a3 7 lh 9:99 
52 |LO LO 39 39 38 38 37 36 36 35 3ἱ 33 32 31 30 29 28 26 25 23 20 17 il {100 


ci fLl hl LO LO 39 38 38 37 36 36 35 3l 33 


32 31 30 29 28 26 25 23 21 18 1l, 7 


75 74 73 72 71 70 69 68 67 66 65 64 63 62 61 60 59 58 57 56 55 54 53 52 51


TABLE VI

ORDERS OF MERIT INTO UNITS OF AMOUNT
(ACCORDING TO HULL)

Rank in Class (rows), Size of Class (columns)


32 33 33 3h 35 35 36 37 37 38 38 39 39 40 


30 31 31 32 33 3h 3L 35 36 36 37 37 38 38 39 39 LO ho LO hi ii 
29 30 30 31 32 33 34 3h 35 35 36 37 37 38 38 39 39 39 LO ho hi 
22 28 29 30 31 32 33 33 3h 39 35 36 36 37 37 38 38 39 39 ho he 


15 17 19 21 


23 
11 15 17 19 21 23 2b 25 


i ur L2 3 13 b b iy pu bi US LS Ló L6 hé hó L7 b7 L7 ἰδ h8 pA De j 53 




39 ho ho Ll Ll L2 h2 43 h3 h3 hl bb L5 LS h5 L6 L6 h6 L7 L7 L7 
39 39 ho LO hi Ll ha μὲ 42 h3 L3 ll Ub bb bo LS LS L6 66 L? 4? 47 
38 38 39 LO LO LO hi hi he be h3 L3 43 hh bb LL L5 L5 L5 ιό L6 L6 L7 
37 38 38 39 39 LO LO hl L1 L2 he L2 L3 h3 Ub Lb bb 15 o L5 h6 L6 ἡ 
36 37 37 38 39 39 LO LO LO Ll hl L2 2 L3 L3 L3 bb bb hh &S L5 4S 
35 36 37 37 38 38 39 39 LO LO Ll LI kl 42 L2 L3 L3 h3 hh hh L5 L5 
35 35 36 37 37 38 38 39 39 LO LO Ll Ll Ll L2 be h3 h3 h3 hl Lh Lh 
3h 3h 35 36 36 37 37 38 20 32 39 ho LO hi hi L2 L2 h2 L3 h3 L3 nnn 
33 34 3h 35 36 36 37 31 38 38 30 ασ LO LO 41 41 hi h2 L2 h2 L3 

ho ho ho hi Ll he be be L3 lh 




31 32 32 33 3h 35 35 36 36 31 31 38 38 39 7 LO LO hi Ll Ll 42 L 


26 27 28 29 30 31 32 32 33 34 34 35 36 36 37 37 38 38 39 39 39 L 
25 2627 28 29 30 31 32 32 33 3h 34 35 35 36 37 21 38 38 38 39 3? 
23 25 26 27 28 29 30 21 31 32 33 34 3h 35 35 36 36 31 37 38 38 39 3? LO οι 7 
22 23 25 26 27 28 29 30 31 31 32 33 32 3 35 35 36 36 31 37 38 38 39 39 39/1 
20 21 23 2l 26 27 28 29 30 30 31 32 33 33 34 3h 35 36 36 37 21 3 

17 20 21 23 2h 25 27 28 29 ?? 30 31 32 32 33 34 2h 35 35 36 36 37 
ol 25 26 27 28 29 30 31 32 32 33 33 3} 35 35 36 36 37 37 38) 7h 
26 27 28 29 30 31 31 32 33 33 3u 35 35 36 36 37 37 15 


TABLE VII

NORMAL EQUIVALENTS OF RANKS, WHERE T = 50 + 10(X - M)/S.D.

Rank in Class (rows), Size of Class (columns)

[Table body printed sideways in the original; entries not recoverable from the scan.]

Rank    % Position    x/σ     T Score
  3     91.935485     1.40     64.0
  4     88.709679     1.21     62.1
  5     85.483873     1.06     60.6

A constant may be subtracted each time to obtain the next percent position, as follows:

100/31 = 3.225806
98.387097 - 3.225806 = 95.161291

The use of scores translated into units of amount according to Hull, for purposes of correlation, will produce slightly higher correlations than when T scores according to the unit normal curve are used.
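The shortcut of subtracting a constant can be sketched as follows. The function name `percent_positions_from_top` is illustrative, and the positions are measured from the top of the group, as in the worked table for 31 individuals. Printed values obtained by repeatedly subtracting the rounded constant 3.225806 can differ from these in the last decimal place.

```python
def percent_positions_from_top(n):
    """Percent positions for ranks 1..n, each 100/n below the one before."""
    step = 100 / n            # the constant subtracted each time
    first = 100 - step / 2    # position of rank 1, i.e. 100 - 100(1 - .5)/n
    return [first - k * step for k in range(n)]


positions = percent_positions_from_top(31)
for rank, position in enumerate(positions[:5], start=1):
    print(rank, round(position, 6))
```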


AN EMPIRICAL INVESTIGATION OF THE PROB-
LEM OF DISPROPORTIONATE FREQUENCIES 
IN ANALYSIS OF COVARIANCE AS APPLIED 
TO A METHODS EXPERIMENT 


DAISY STARKEY EDWARDS and SIDNEY J. PARKIN 
University of London 


THE EXPERIMENT to be described encountered difficulties familiar to many who undertake research in the classroom. They arise from the fact that it is seldom possible, in practice, to obtain an experimental structure exactly conforming to the requirements of statistical theory. In England, the yearly entry of 100 or more into a secondary school is often divided into "streams" of different intellectual ability, and here, for instance, it is not likely that a "random sample" for experimental purposes will be found "in situ" in the classes of any given school. Re-arrangement in randomised groups as in Lindquist (1) could so disrupt school organisation that the co-operation of those in authority would not be readily forthcoming even though they were favorably disposed towards experimental work.

If differences are almost certain to be due to inequalities in the mental capacity of the individuals in the various classes, or some other measurable quantity, they may be overcome by using the method of analysis of covariance in cases where the normal method would be the analysis of variance. A second difficulty, not disposed of so easily, is that many designs require rather rigid restrictions on the number of cases in the experimental units. If, again, the convenience of the schools is studied, and the classes are not altered in numbers, appreciating the fact that existing methods based on exact theory lead to prohibitively laborious calculations, evidence on the nature of the error likely to be committed in using numbers only slightly deviating from the ideal theoretical condition is valuable. Both matters are considered in the statistical handling of the data of the experiment.

It was planned to compare the relative effects of three methods of presenting diagrams, in otherwise similar lessons, to children of first year secondary school age. In order that previous training should not affect the results, the lesson chosen was on a topic not likely to have been met previously by the children, and involved [...]. The topic chosen was "The Construction of a Slipper" and all groups were given
the same lesson by Mr. Parkin. The lesson 
could be conveniently illustrated by means of 
three line diagrams. The first showed the shape 
and names of the parts which are sewn together 
to make the upper of a simple slipper, the sec- 
ond was an “‘exploded”’ line diagram showing 
the upper on a last and the various components
which have to be attached to complete the slip- 
per, and the third was a perspective line sketch 
of the finished slipper with the toe cut away to 
show in section the relative positions and mode 
of attachment of the various components. The 
lesson, which took about fifteen minutes, follow- 
ed the order of practical procedure of slipper 
construction and a short recapitulation was made 
when explaining the third diagram. 

So far as was experimentally possible the only 
difference between the lessons lay in the method 
of presenting the diagrams. In one case they 
were built up freehand one at a time on the black- 
board, in the second case they were presented 
in the form of projected images from three film- 
slides, and in the third case they were present- 
ed in the form of three prepared wall charts.
Both wall chart and filmslide diagrams consisted 
of white lines on a black background so as to con- 
form with the blackboard diagrams, and the size 
of the illustration presented in all cases was kept 
as uniform as possible. The experimenter 
took care to build up the board diagram in the 
manner in which he would normally illustrate a 
lesson. He considers that he is competent, but 
not exceptionally gifted, in blackboard work. 

In England, allocation of children into various 
types of secondary school takes place at the age 
of eleven, approximately, mainly on the basis of 
a selection examination which in the district con- 
cerned consists substantially of tests of intelli- 
gence and attainment in Arithmetic and English. 
Roughly, their order in the examination places 
them into Grammar, Technical or Modern
Schools, though parents occasionally suggest
that they wish a child to go into a Technical
School instead of a Grammar School even though
his performance entitles him to a place in the 
latter. Since the education offered in these 
types of school is different, the results of the 
experiment might be affected by this fact, and
all types of school had to be included in the in-
vestigation. There was also the possibility that 
boys and girls might have different results, so 
that it was necessary to include girls’, boys’ 
and co-educational schools, and also it was ar- 
ranged that the main comparison, between meth- 
ods, should in each of the three cases, have ap- 
proximately equal numbers of boys and girls. 
To allow for inequalities in intelligence of the 
children in the various classes, a preliminary 
short intelligence test consisting of items on
order, correspondence and analogies was ad- 
ministered. This test is not available in pub- 
lished form, but was used in a County selection 
examination elsewhere, and had been carefully 
standardized. Without entering into any discus- 
sion on the factorial contents of the test, it was 
eventually found that its results had a significant 
correlation with those of a short test given im- 
mediately after the lesson on the facts taught in 
the lesson, used as the final measurement in the
experiment, and it was concluded that its effect
should be eliminated from the methods of com-
parison. The methods were, within schools,
and with one small exception to be described 
later, allocated at random to the classes. 

One requirement of the two-way comparison, 
with schools and methods as the main effects, 
could not be met exactly. This is the numerical 
restriction which requires proportionate frequen- 
cies in each combination of method and school. 
The numbers actually made available are given 
in Table I. 

The elimination of the intelligence variate 
was to be attained by the use of analysis of co- 
variance. Analysis of covariance is a refine- 
ment of analysis of variance, designed to cover 
certain cases in which the simpler technique 
could not secure sufficient control of error. It 
is usually used in cases comparable to the pres- 
ent one, where intrinsic variations in the exper- 
imental material give rise to heterogeneity in 
the results which is irrelevant to the investiga- 
tion, and neglect to take account of these may 
lead to negative or even wrong conclusions. Our 
preliminary measurement of intelligence of the 
individuals in the experimental groups, estimat- 
ed by the experimenters as an attribute of the 
individuals which will almost certainly affect 
the results, is used to make an adjustment to the 
final readings, provided that it is verified dur- 
ing the analysis that the assumption of its effect 
is correct. 

We give a brief account of the rationale of
the method of analysis of covariance which, it is
hoped, will clarify the method of treatment of the
data obtained in this experiment. It may be said,
in essentials, to be a way of allowing for the
linear regression of the final variate, $y$, on the one
or more initial variates $x_1, x_2, \ldots, x_p$ by
carrying out an analysis of variance of the deviation


(Vol. XXI 


of $y$ from its regression estimate. Consider
the case $p = 1$. Let $y_{ts}$ be the $t$-th observation
in the $s$-th group, and $y'_{ts}$ be its deviation from
the sample mean, similarly for the $x$'s. Suppose
we are considering the simplest design of
experiment, a "between-within" comparison between
groups. Taking a null hypothesis of no
difference between groups, and using the least squares
approach with corresponding population
parameters, it is easy to prove that an F-test is
appropriate.
The form of the F ratio indicates the following
computational procedure. First, split the
sums of squares and products, $\Sigma x'^2$, $\Sigma y'^2$,
$\Sigma x'y'$, into "between groups" and "within groups"
components; the three components of the latter
provide the error term, the denominator of F.
The mathematical procedure having shown that
this is $\Sigma (y' - bx')^2$, within groups, divided by an
appropriate number of degrees of freedom, we
compute the value of this square sum as
$\Sigma y'^2 - b\,\Sigma x'y'$. The numerator of F is found
to be based on the difference of a quantity similar
to the above, computed from sums and products
for totals, and the "within groups" square sum.
Unlike analysis of variance, there is no alternate
procedure for the latter using "between groups"
squares and products, as this would result in a
quantity mathematically less than or equal to the
above difference.
If the mathematical procedure were carried
out in more complicated cases, it would be found
that the following rule of thumb procedure is the
appropriate generalisation of the above findings
for one independent variate $x$ only, in any design
where the group frequencies are proportionate:

1. Use at all stages of the analysis that
$\Sigma (y' - bx')^2 = \Sigma (y'^2) - b^2\,\Sigma (x'^2)$, where $b$ is the
regression coefficient appropriate to the
particular summation.

2. Split up the sums $\Sigma (y'^2)$, $\Sigma (x'^2)$, $\Sigma (x'y')$ into
components appropriate to the problem and
design of experiment, but

3. Instead of immediately computing $\Sigma (y' - bx')^2$
on each line, combine the last line of the
columns of this preliminary analysis (error)
with the components corresponding to each
main effect in turn, and each interaction in
turn, calculate $\Sigma (y' - bx')^2$ on each of the
new lines formed, calculate this sum also
on the error line, and fill in $\Sigma (y' - bx')^2$ on
the original lines by

4. subtraction of the error value from the
value of this sum on the line representing the
combined effect of error and the appropriate
main effect or interaction.
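
The rule of thumb above can be sketched for the simplest "between-within" design with one initial variate. This is a minimal illustration, not the authors' worksheet: the function names and the toy data are hypothetical, and the error degrees of freedom are reduced by one for the fitted regression coefficient.

```python
def sums_about_mean(pairs):
    """Return (Sxx, Syy, Sxy), the sums of squares and products about the means."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    return sxx, syy, sxy


def adjusted_ss(sxx, syy, sxy):
    """Sum of squared deviations from regression: Syy - b^2 * Sxx with b = Sxy / Sxx."""
    b = sxy / sxx
    return syy - b * b * sxx


def covariance_f(groups):
    """F ratio for group differences in y after allowing for regression on x."""
    # "Within groups" (error) line: add the per-group sums of squares and products.
    within = [sum(col) for col in zip(*(sums_about_mean(g) for g in groups))]
    # "Total" line, which here plays the part of the combined "error + groups" line.
    pooled = [pair for g in groups for pair in g]
    total = sums_about_mean(pooled)
    adj_within = adjusted_ss(*within)
    adj_between = adjusted_ss(*total) - adj_within  # step 4: subtraction
    df_between = len(groups) - 1
    df_within = len(pooled) - len(groups) - 1  # one d.f. lost to b
    f_ratio = (adj_between / df_between) / (adj_within / df_within)
    return adj_between, adj_within, f_ratio


# Hypothetical (x, y) pairs for two groups of three children.
toy_groups = [[(1, 2), (2, 3), (3, 5)], [(1, 4), (2, 6), (3, 7)]]
adj_between, adj_within, f_ratio = covariance_f(toy_groups)
```

For designs with several main effects the same two helper functions would be applied to each "error + effect" line in turn, as steps 3 and 4 describe.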


In case of any doubt, for instance in the
presence of significant interactions, the mathemati-


March, 1954) 


EDWARDS - PARKIN 


TABLE I

NUMBERS OF CHILDREN AVAILABLE IN VARIOUS GROUPS

School                       Filmslide  Wallchart  Blackboard  School Total
Secondary Modern Boys            23         18         26           67
Selective Central Mixed          22         26         31           79
Girls Grammar                    28         31         32           91
Boys Grammar                     24         28         27           79
Secondary Technical Boys         29         27         28           84
Secondary Technical Girls        25         21         25           71
Method Total                    151        151        169          471


TABLE II

PROPORTIONATE GROUP FREQUENCIES

School         Filmslide  Wallchart  Blackboard  School Total
1.                 18         18         24           60
2.                 21         21         28           70
3.                 24         24         32           80
4.                 18         18         24           60
5.                 21         21         28           70
6.                 18         18         24           60
Method Total      120        120        160          400


TABLE III

PROPORTIONATE GROUPS

[Analysis of variance and covariance table. Rows: Method; School; Method x Schools Interaction; Error; Error + Method; Error + School; Error + Interaction. The numerical entries are not legible in this copy.]


TABLE IV

DISPROPORTIONATE GROUPS (6 Schools)

[Table of the same form as Table III. The numerical entries are not legible in this copy.]


TABLE V

DISPROPORTIONATE GROUPS (7 Schools)

[Table of the same form as Table III. The numerical entries are not legible in this copy.]


TABLE VI

VALUES OF F

[F values and probabilities for Method, School, and Method x Schools Interaction, for proportionate and disproportionate groups. The numerical entries are not legible in this copy.]


cal analysis may always be used, but experience
is often a very good guide. An example of the
form of calculation will be given later.

The method would, of course, only be used
in a case where a significant regression of $y$ on
$x$ is found. If significant differences are present
between class means, the regression found
from the total sums of squares and products is
apt to be misleading, and judgment of the
usefulness of the method is usually based on the
regression coefficient calculated from the
entries on the error line. The presence of two or
more initial variates $x$ significantly related to $y$
involves carrying columns in the analysis to
contain the sums of squares and products of the
additional variates, and involves working with
partial regression coefficients $b_1, b_2, \ldots, b_p$. The
quantity subtracted from $\Sigma (y'^2)$ on any line to
give $\Sigma (y' - b_1 x'_1 - b_2 x'_2 - \ldots)^2$ is a generalised
form of $b^2\,\Sigma (x'^2)$ (which may also be written
$b\,\Sigma (x'y')$), i.e., $b_1 \Sigma (x_1 y) + b_2 \Sigma (x_2 y) + \ldots + b_p \Sigma (x_p y)$,
the relation to the general theory of
regression being immediately obvious.

As has already been stated, the disproportionate
frequencies create a difficulty. Following
Snedecor's suggestion (2) we may show that
there is no significant deviation of the frequencies
from proportionality ($\chi^2 = 2.79$, P = 99%).
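
As a check on the figure quoted above, the departure from proportionality can be computed from the six rows of Table I. This is a sketch only: it assumes the usual contingency-table form of the test (expected frequency = row total × column total ÷ grand total), and the function name `chi_square` is illustrative. The value it gives is about 2.80, which agrees with the quoted 2.79 to within rounding.

```python
# Observed class sizes from Table I (columns: Filmslide, Wallchart, Blackboard).
observed = [
    [23, 18, 26],  # Secondary Modern Boys
    [22, 26, 31],  # Selective Central Mixed
    [28, 31, 32],  # Girls Grammar
    [24, 28, 27],  # Boys Grammar
    [29, 27, 28],  # Secondary Technical Boys
    [25, 21, 25],  # Secondary Technical Girls
]


def chi_square(table):
    """Chi-square for departure of cell frequencies from proportionality."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for row, row_total in zip(table, row_totals):
        for cell, col_total in zip(row, col_totals):
            expected = row_total * col_total / grand_total
            chi2 += (cell - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return chi2, df


chi2, df = chi_square(observed)
```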

To check a possible effect of disproportionality,
we propose to compare the results of an
analysis in which this is disregarded with one in
which certain of the observations have been
dropped by random choice. The extent of the
alteration in the experimental numbers is shown
by comparing Tables I and II.

This method of approach is unconventional.
In a case as close to proportionality as that
presented by the original data, instead of following
the laborious procedure of recalculating totals
corresponding to expected frequencies, as
suggested by Snedecor (2), or the method of fitting
constants suggested by Yates (3) and Stevens (4)
and illustrated in the British Journal of Psychology,
Statistical Section (5), an analysis of the
complete set of data has been carried out as though
its frequencies were proportional, and the
results have been compared with those of the first
analysis. Educational data frequently present
very small departures from proportionality, and
the writers feel that it is desirable to present
experimental evidence of this type on the errors
likely to be committed on an assumption of
proportionality. No generality is claimed for this
empirical approach, though the writers are of
the opinion that the extent of the departure of
their data from proportionality is fairly typical
of much educational data.

The primary results obtained in the analysis
of exactly proportional figures are shown within
the heavily outlined rectangle in Table III.
From the error line of this analysis we obtain
an estimate, .127, of the "within groups"
correlation between $y$ and $x$, and with 382 d.f. this
is evidently significant compared with the P =
.05 value of .100, and almost as large as the P
= .01 value of .132, computed from the standard
error of zero correlation for this number of
degrees of freedom.
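
The two critical values quoted above can be reproduced under one assumption, labelled here because the text does not give the formula: that the standard error of a zero correlation is taken as $1/\sqrt{\text{d.f.}}$ and multiplied by the two-sided normal deviate. The function name `critical_r` is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def critical_r(df, p):
    """Smallest |r| significant at two-sided level p, assuming SE(r) = 1 / sqrt(df)."""
    z = NormalDist().inv_cdf(1 - p / 2)  # two-sided normal deviate
    return z / sqrt(df)


r_05 = critical_r(382, 0.05)  # about .100
r_01 = critical_r(382, 0.01)  # about .132
observed_r = 0.127            # "within groups" correlation reported in the text
significant_at_05 = observed_r > r_05
significant_at_01 = observed_r > r_01
```

On this reading the observed correlation of .127 exceeds the .05 value and falls just short of the .01 value, which matches the statement in the text.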

The analysis of $y^2$ with the effect of $x$
eliminated according to the method previously outlined
is shown in Table III, and the significance of the
results is recorded in Table VI. It should be
observed that, owing to the use of "b", a function
of the observations, in computing the adjusted
sum of squares, the number of degrees of
freedom of the error term is one less than it would
have been in the analysis of the $y^2$ values alone.

Figures in the seventh column of Table III
are obtained by subtracting the figures in the
sixth column from those in the fourth column;
i.e., $\Sigma d^2 = \Sigma y'^2 - b^2\,\Sigma x'^2$. The F values in
parentheses are obtained by using the interaction
variance instead of error variance against
which to estimate the main classification effects.
Any discrepancies in the figures in this and
following tables are due to the fact that the figures
used in the original calculations have been rounded
off to two decimal places for the purposes of
this article.

The analysis of the whole set of observations
is shown in Table IV.

The data originally collected in the experiment
included results from a Secondary Modern
Girls' School. Owing to a technical hitch in
this school the numbers being taught by the three
methods were very disproportionate to those in
the other schools. They were, respectively,
18, 19 and 9 in methods I, II and III. For
results including these, $\chi^2 = 7.753$ with 12 d.f.
[P value illegible], and we include the analysis of
this data, obtained with an even more unsuitable
set of frequencies, for comparison with the other two.

Even with the extremely disparate numbers
in the three experiments we obtain values of F
for which the probabilities are very similar.
Table VI illustrates this point.

Finally, to support the findings regarding
method differences, we include for interest a
3 × 3 Latin Square analysis which was deliberately
incorporated in three of the schools analysed.
This represents a small departure from
the statistical requirement that the allocation of
the methods in each school should be random,
but it is not evident that such a collective random
allocation to the three schools should make
a serious difference to the results. It is perhaps
more important to notice that these three
schools were Grammar and Central Schools and
that the results of analysing their results are
thereby restricted to children of fairly high
intelligence. Since, however, these three schools
were all divided into three streams, and one of


TABLE VII

LATIN SQUARE RESULTS

Source of Variance        d.f.   Sum of Squares   Mean Square     F
Methods                     2         314.67         157.34      5.87
Intelligence (Streams)      2         108.22          54.11      2.02
Schools                     2          70.98          35.49      1.32
Error                                6569.12          26.81


TABLE VIII

VALUES OF t

                                           Methods I and II    Methods II and III   Methods I and III
                                           Value   Signif.     Value   Signif.      Value   Signif.
Proportionate Frequencies                   1.21    n.s.        5.71    .1%          4.41    1%
Disproportionate Frequencies (6 schools)     .56    n.s.        5.55    .1%          4.97    1%
Disproportionate Frequencies (7 schools)     .01    n.s.        5.25    1%           5.23    [illegible]


the main effects comparisons is "between streams",
while the "methods" comparisons to some
extent eliminate streams and therefore intelligence
differences, the results of the analysis are of
interest as they probably illustrate the increase
in precision gained by using analysis of
covariance, which attempts to eliminate the effect of
intelligence more exactly.


The significance 
of freedom are 3.0 
-1%. We see that th 
is not significant at 




(7 schools). 

The significant differences are all in favour 
of the blackboard presentation. We conclude, 
therefore, that on the evidence presented, 


1. Results of an analysis carried out by the 
method of analysis of covariance are not 
greatly affected by a moderate degree of 
disproportionality of the figures even 
though the method used is appropriate to 
strictly proportionate numbers. 


2. Blackboard illustration built up in the
course of a lesson by a reasonably
competent illustrator gives superior immediate
results to those obtained by illustrating
with wallcharts or filmslides.


It is of interest to note that the results of a
delayed recall test presented one month after
the lesson to all test groups confirmed the
superiority of the blackboard, although the actual
differences between the method means were smaller
than in the immediate recall test.


REFERENCES 


1. Lindquist, E. F. Statistical Analysis in
Educational Research (Cambridge: The Riverside
Press, 1940).

2. Snedecor, G. W. Statistical Methods (Ames,
Iowa: Collegiate Press, 1946).

3. Yates, F. "The Analysis of Multiple
Classifications with Unequal Frequencies in the
Different Classes," Journal of the American
Statistical Association (1934).

4. Stevens, W. L. "Statistical Analysis of a
Non-Orthogonal Tri-Factorial Experiment,"
Biometrika, XXXV (1948), p. 346.

5. Deans Peggs, A. "Analysis of Variance
with Unequal Numbers," British Journal
of Psychology, Statistical Section, IV,
Part II (1951),


p. ΤΊ. 


ESTIMATING COMPONENTS OF VARIATION 


IN AN EXPERIMENTAL STUDY 
OF LEARNING 


WILLIAM HARRISON LUCOW 
Lord Selkirk School 
Winnipeg, Manitoba, Canada 


Introduction 


THE PURPOSE of the investigation in this
single, self-contained experiment was to examine
the variances arising from two approaches to
the learning of introductory high school chemistry,
and to present the formulas and calculations
by which the components of variance and
their fiducial and confidence limits might be
estimated. The differences in variance were tested
for significance, and Model II analysis of
variance was used to determine the components.

The change in variance from pre-test to after-test
on the criterion examination was considered
to be of greater import than the change in
mean, under the assumption that greater variance
in a group indicated greater expression of
individual differences. Where a manufacturer
might wish to eliminate factors that contribute
to the variance of his product, the democratic
educationist might wish to avail himself of factors
that contribute to the variance in performance
of pupils. This experiment was a study in
variation and an example of the application of
techniques and formulas appropriate to the
determination of the components of variance in a
population.

Methods


The contrasting methods of learning chemistry
were both "real" in the sense that both
could be found in operation generally in the high
schools of Manitoba. One was a textbook-centered
approach and the other was a laboratory-centered
approach. The distinction was one of
emphasis rather than of abstraction.


Population and Samples


Two distinct populations were chosen, and
the experiment was run separately for each.
One population consisted of "accelerated" students
who followed a course designed for university
matriculation; the other population consisted
of "non-accelerated" students taking a course
not carrying sufficient immediate credit for university
entrance.

The classes of 1952-53 were taken as samples
and shown to be representative of the foregoing
populations. The sample of accelerated
students numbered thirty-six, eighteen of whom
followed the textbook-centered approach, and
eighteen the laboratory-centered approach. The
sample of non-accelerated students numbered
twenty-four, which made up a group of twelve
for each approach.


Design 


Each group was randomly divided into
sub-groups of three. Thus, two treatments (methods
of learning) were administered to six replicates
(random sub-groups), each containing three
individuals. The measure of any individual,
expressed in the form of a score, $X_{hij}$, was a
combination of the effects of treatment, replicate,
interaction of treatment and replicate, and
experimental error. Expressed in the form of
an equation,

\[ X_{hij} = \mu + \alpha_h + \beta_i + (\alpha\beta)_{hi} + \epsilon_{hij} \]

where $\mu$ is the general mean, $\alpha_h$ is the effect
of the $h$th treatment, $\beta_i$ is the effect of the $i$th
replication, and $(\alpha\beta)_{hi}$ is the effect of the $hi$th
interaction between treatment and replication.
$\epsilon_{hij}$ is the effect of general experimental error.
In this study, the parameters of major interest
were not the foregoing variables, but their
variances: $\sigma_\alpha^2$, $\sigma_\beta^2$, $\sigma_{\alpha\beta}^2$, and $\sigma_\epsilon^2$ respectively.
They are the components of variation.

For the accelerated pupils, Xhij was any 
score according to the following design: 


Sub-group    Textbook Method      Laboratory Method

I            X111  X112  X113     X211  X212  X213
II           X121  X122  X123     X221  X222  X223
III          X131  X132  X133     X231  X232  X233
IV           X141  X142  X143     X241  X242  X243
V            X151  X152  X153     X251  X252  X253
VI           X161  X162  X163     X261  X262  X263

The analysis of variance table for the accelerated
pupils (six sub-groups, two treatments,
and three individuals in each sub-group) is given
on page 268.

A similar design was developed for the
non-accelerated pupils (four sub-groups, two
treatments, and three individuals in each sub-group).

The design was based on the randomized
complete blocks model given by Anderson and
Bancroft. (2)

The E(MS) column in the Model II analysis of
variance table mentioned above represents the
expectation of the mean square in the population.
For the error variance, $\sigma_\epsilon^2$, the average value
of the mean square of the three measures within
each sub-group is taken. The expectation of
the mean square of interaction between the two
treatment means and six sub-group means is:

\[ \sigma_\epsilon^2 + \frac{3 \times 2 \times 6}{2 \times 6}\,\sigma_{tb}^2 = \sigma_\epsilon^2 + 3\sigma_{tb}^2 . \]

The expectation of the mean square of the two
treatment means is:

\[ \sigma_\epsilon^2 + \frac{3 \times 2 \times 6}{2 \times 6}\,\sigma_{tb}^2 + \frac{3 \times 2 \times 6}{2}\,\sigma_t^2 = \sigma_\epsilon^2 + 3\sigma_{tb}^2 + 18\sigma_t^2 . \]

The expectation of the mean square of the
sub-group means is:

\[ \sigma_\epsilon^2 + \frac{3 \times 2 \times 6}{2 \times 6}\,\sigma_{tb}^2 + \frac{3 \times 2 \times 6}{6}\,\sigma_b^2 = \sigma_\epsilon^2 + 3\sigma_{tb}^2 + 6\sigma_b^2 . \]


These calculations follow the rule enunciated
by Crump. (5)
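
The coefficients in the expectations above can be sketched as a small calculation. The sketch assumes the rule amounts to dividing the total number of observations by the number of levels (or combinations) to which each component applies; the function name and dictionary keys are illustrative, not taken from the source.

```python
def expected_ms_coefficients(r, c, k):
    """Coefficients of the variance components in E(MS).

    r: number of sub-groups (blocks), c: number of treatments,
    k: individuals per treatment within a sub-group.
    """
    n_total = r * c * k
    return {
        "interaction": n_total // (r * c),  # coefficient of sigma_tb^2
        "treatments": n_total // c,         # coefficient of sigma_t^2
        "blocks": n_total // r,             # coefficient of sigma_b^2
    }


accelerated = expected_ms_coefficients(r=6, c=2, k=3)      # 36 pupils
non_accelerated = expected_ms_coefficients(r=4, c=2, k=3)  # 24 pupils
```

For the accelerated design this gives 3, 18 and 6, the multipliers that appear in the three expectations above.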


Statistics Relating to the Estimated Components


Formulas for the standard error of estimate,
fiducial limits, and confidence intervals are
presented [...]. The formulas are given by Anderson
and Bancroft (2), who [...] Fisher and Bross. The
fiducial limits for a variance component are:


(Fo/F2) - 1 


E vene |--σωευ-: } s. 


Fi (Fo/Fj)- 1 




The population confidence interval for a vari- 
ance component is: 


(Fo/Fa) - 1] sc ee | 8ο/51) -1 te 
ee coc i; 


In the foregoing formulas, $F_0$ is the ratio,
$V_1/V_4$, obtained from the data. $F_1$ is $F_{.95}$ with
$n_1$ and $n_4$ degrees of freedom, which equals
$1/F_{.05}$ with $n_4$ and $n_1$ degrees of freedom. (Note
the reversal in the order of degrees of freedom
when stating $F_{.95}$ in terms of $F_{.05}$. In order to
find the proper values in the F-table, the degrees
of freedom must be used in the sequence given.)
$F_2$ is $F_{.05}$ with $n_1$ and $n_4$ degrees of freedom.
$F'_1$ is $F_{.95}$ with $n_1$ and $\infty$ degrees of freedom.
$F'_2$ is $F_{.05}$ with $n_1$ and $\infty$ degrees of freedom.

It is not uncommon that the value of a component
should turn out negative. When this occurs,
the value is taken as zero.


Practical Application of Model II (9) 


Model II follows exactly the same pattern as
Model I in the analysis of variance table up to the
end of the mean square column. In Model II there
is added the column listing expressions expected
to be equal to the mean square in the population.
Thus, if repeated samples had been analyzed to
yield mean squares between methods, for
instance, the average of these mean squares would
be taken as the expectation of the mean square in
the population.

Table II shows the analysis of variance of the
results on the criterion after-test by the accelerated
groups. The unknowns in the expressions
in the last column of the table consist of variances
which are considered to be the components of
variation in the population. Estimation of these
components is accomplished by equating the mean
squares to their expectations and solving for the
unknown variances.

Starting at the error line, the mean square of
the error forms the estimate of the population
error variance,

\[ \hat\sigma_\epsilon^2 = 395.1111 . \]

The interaction component is then estimated by
subtracting the error mean square from the
interaction mean square and dividing by 3:

\[ \hat\sigma_{tb}^2 = (659.7167 - 395.1111) \div 3 = 88.2019 . \]

The treatment (methods of instruction) variance
estimate in the population is obtained by
subtracting the interaction mean square from
the treatment mean square and dividing the dif-


TABLE I


SYMBOLIC TABLE FOR USE IN COMPUTING COMPONENT STATISTICS


Source   d/f                         Mean Square   Expectation MS
1        $n_1 = (r - 1)$             $V_1$         $\sigma_\epsilon^2 + k\sigma_{tb}^2 + ck\sigma_b^2$
2        $n_2 = (c - 1)$             $V_2$         $\sigma_\epsilon^2 + k\sigma_{tb}^2 + rk\sigma_t^2$
3        $n_3 = (r - 1)(c - 1)$      $V_3$         $\sigma_\epsilon^2 + k\sigma_{tb}^2$
4        $n_4 = (k - 1)rc$           $V_4$         $\sigma_\epsilon^2$
TABLE II


COMPONENT ANALYSIS OF VARIANCE TABLE FOR THE CRITERION
EXAMINATION AFTER-TEST OF ACCELERATED PUPILS


Source        d/f   Sum of Squares   Mean Square   Expectation MS
Blocks          5      1934.1389       386.8278    $\sigma_\epsilon^2 + 3\sigma_{tb}^2 + 6\sigma_b^2$
Treatments      1         6.2500         6.2500    $\sigma_\epsilon^2 + 3\sigma_{tb}^2 + 18\sigma_t^2$
Interaction     5      3298.5833       659.7167    $\sigma_\epsilon^2 + 3\sigma_{tb}^2$
Error          24      9482.6667       395.1111    $\sigma_\epsilon^2$
Total          35     14721.6389


Analysis of Variance Table


Source            d/f   SS        MS        E(MS)
B  Sub-groups       5   SSB       MSB       $\sigma_\epsilon^2 + 3\sigma_{tb}^2 + 6\sigma_b^2$
T  Methods          1   SST       MST       $\sigma_\epsilon^2 + 3\sigma_{tb}^2 + 18\sigma_t^2$
T × B               5   SS(TB)    MS(TB)    $\sigma_\epsilon^2 + 3\sigma_{tb}^2$
Error              24   SSE       MSE       $\sigma_\epsilon^2$
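Standard Errors of Estimate

\[ s_{\hat\sigma_\epsilon^2} = \sqrt{\frac{2V_4^2}{n_4 + 2}} \qquad s_{\hat\sigma_{tb}^2} = \frac{1}{k}\sqrt{\frac{2V_3^2}{n_3 + 2} + \frac{2V_4^2}{n_4 + 2}} \]

\[ s_{\hat\sigma_t^2} = \frac{1}{rk}\sqrt{\frac{2V_2^2}{n_2 + 2} + \frac{2V_3^2}{n_3 + 2}} \qquad s_{\hat\sigma_b^2} = \frac{1}{ck}\sqrt{\frac{2V_1^2}{n_1 + 2} + \frac{2V_3^2}{n_3 + 2}} \]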


ference by 18. Thus, $\hat\sigma_t^2 = (6.2500 - 659.7167)
\div 18 = -36.3037$. (This estimate is taken as
zero.)

The sub-group (blocks) variance estimate is
obtained in the same way: $\hat\sigma_b^2 = (386.8278
- 659.7167) \div 6 = -45.4815$. Since the
result is negative, the estimate of the
component is taken as zero.

Applying the formulas for standard error,
fiducial limits, and confidence intervals to the
[...] component estimate [...]4.93, the fiducial
limits are (0.00, 787.88), and
the confidence interval is (0.00, 864.47).
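
The estimation just described, equating each mean square to its expectation and solving from the error line upward, can be sketched with the mean squares of Table II. The function and key names are illustrative assumptions; negative estimates are set to zero as the text directs.

```python
def estimate_components(ms_blocks, ms_treatments, ms_interaction, ms_error, r, c, k):
    """Solve the Model II expectations for the variance components.

    Returns (raw, reported): raw estimates, and the same with negatives set to zero.
    """
    raw = {
        "error": ms_error,
        "interaction": (ms_interaction - ms_error) / k,
        "treatments": (ms_treatments - ms_interaction) / (r * k),
        "blocks": (ms_blocks - ms_interaction) / (c * k),
    }
    reported = {name: max(0.0, value) for name, value in raw.items()}
    return raw, reported


# Mean squares from Table II: six sub-groups, two treatments, three pupils each.
raw, reported = estimate_components(
    ms_blocks=386.8278,
    ms_treatments=6.2500,
    ms_interaction=659.7167,
    ms_error=395.1111,
    r=6, c=2, k=3,
)
```

Keeping the raw values alongside the reported ones makes it visible that two of the four components were negative before being taken as zero.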


The Use of Components of Variance 


[...] design itself, particularly in the size of
sub-groups taken as replicates. It may be worthwhile
to investigate the possibilities of a design that
would make use of entire classrooms as
plots.


A knowledge of components of variance may
be used to reduce variation where variation is
not wanted or to increase variation where
variation is wanted. These changes in variation
may be effected by altering the conditions
involved in the situation that produces variation.
A knowledge of components may help to lend
precision to the process of altering variation.


Thus, new promise is added to educational
experimentation.


The Measuring Instrument


The criterion examination in chemistry was
developed during a pilot study a full year
previous to the commencement of the experiment.
This examination was valued at 216 marks [...]
concepts and principles
and interpretation.
The item form used throughout was the
multiple-response variety of the
multiple-choice form, presented to the pupils
as "the whole truth and nothing but the truth"
type of item. The sample item read:


Three times five is more than 
(a) 5 


[Figure 1. Panels: Accelerated Pupils and Non-Accelerated Pupils; within each, Textbook Group and Laboratory Group, Pre-Test and After-Test.]


Figure 1


Mean ± One Standard Deviation for All Groups on the Pre-Test and
After-Test of the Criterion Examination


The pupils were instructed that (a, b) was the
correct response, and that any combination
from one to all choices might be the
response called for in the items of the examination.

The examination was administered as a pre-test
and as an after-test.


Comparison of Pre-Test and After-Test 
Variances 


The t-test used to compare variances was:

\[ t = \frac{(s_1^2 - s_2^2)\sqrt{N - 2}}{\sqrt{4(1 - r^2)(s_1^2 \times s_2^2)}} \]

The confidence interval for a coefficient of 0.95
is given by the expression:

\[ \frac{s_1^2}{s_2^2}\left(k - \sqrt{k^2 - 1}\right) \le \frac{\sigma_1^2}{\sigma_2^2} \le \frac{s_1^2}{s_2^2}\left(k + \sqrt{k^2 - 1}\right) \]

[...] with N − 2 degrees of freedom [...] = 0.95.

[...] ratio of the variances, [...] with a 95% confidence
interval of (2.1137, 6.0248). This meant that
the ratio of the after-test variance to the
pre-test variance in the population could be as low
as 2.1137 or as high as 6.0248. With a significance
level of 0.05, all hypothesized ratios of
the variances smaller than 2.1137 and larger
than 6.0248 would be rejected. Thus, and as
indicated by the t-test, a ratio of one, or the
hypothesis that the variances were equal, would be
rejected.
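
The t statistic for comparing two variances measured on the same pupils can be sketched as below. The formula is reconstructed from a damaged original, so its exact form is an assumption: it is taken to be the usual test for correlated variances, with $N - 2$ degrees of freedom and $r$ the pre-test to after-test correlation. The function name and the illustrative numbers are hypothetical, not the study's data.

```python
from math import sqrt


def correlated_variance_t(var1, var2, r, n):
    """t statistic (n - 2 d.f.) for the difference between two correlated variances.

    var1, var2: sample variances on the two occasions,
    r: correlation between the two sets of scores, n: number of pupils.
    """
    return (var1 - var2) * sqrt(n - 2) / sqrt(4 * (1 - r * r) * var1 * var2)


# Hypothetical illustration: variances 4 and 1, uncorrelated scores, 18 pupils.
t_example = correlated_variance_t(4.0, 1.0, 0.0, 18)
t_equal = correlated_variance_t(2.5, 2.5, 0.3, 18)
```

A value of zero results whenever the two sample variances are equal, whatever the correlation.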


interval of (2.1246, 8.1944).

With the non-accelerated pupils the difference
was marked. The variance of the [...]
of non-accelerated pupils actually [...]
from 150.27 in the pre-test to [...] in the
after-test. The t-test showed this [...]
to be non-significant. The variance ratio was
1.0671 with a confidence interval of (0.5063,
[...]2489).

The variance of the laboratory group of
non-accelerated pupils increased from 85.66 to 5[...],
an increase significant at the one percent level.
The variance ratio was 5.9675 with a confidence
interval of (2.6097, 13.6450).

Figure 1 shows the results of the criterion
examination in terms of means and standard
deviations. The increase in means from pre-test
to after-test was in all cases highly significant
(one-tenth of one percent level). With the
accelerated pupils, increase in variance was
significant for both groups. With the non-accelerated
pupils, there was no significant change in
variance with the textbook group, but a significant
increase in variance for the laboratory group.


Summary and Conclusions 


This paper has indicated the formulas and
procedures that might be used in analyzing the
variance in educational test results. The
procedure is that of Model II analysis of variance,
suitable for the determination of variance
components.

The experimental results show that in the
population sampled of high school chemistry pupils,
the brighter, or accelerated pupils, increase in
variance as groups whether they use the textbook
approach or the laboratory approach to the study
of chemistry.

The less bright, or non-accelerated pupils,
on the other hand, profit more from the laboratory
approach insofar as increase in variance as
a group is concerned.

The writer recommends that the laboratory
approach be used for all pupils.

It is here suggested as a postulate in educational
philosophy that great variation in classroom
achievement is evidence of the release of
individual dif[...]. [...] research might be the
[...]amination of studies that have been made, and
have had a suitable statistical design, for
indications of methods that yield greater variance.


REFERENCES

1. Anderson, Kenneth E. "Improving Science
Teaching Through Realistic Research,"
Science Education, XXXVII (1953), pp. 55-

2. Anderson, R. L. and Bancroft, T. A. Statistical
Theory in Research (New York: McGraw-Hill,
1952), 399 pp.

3. Barr, Arvil S., Davis, Robert A., and Johnson,
Palmer O. Educational Research and
Appraisal (New York: Lippincott, 1953),
362 pp.

4. Bross, Irwin. "Fiducial Intervals for Variance
Components," Biometrics, VI (1950),
pp. 136-144.

5. Crump, S. L. "The Estimation of Variance
Components in Analysis of Variance,"
Biometrics Bulletin, II (1946), pp. 7-11.

6. Crump, S. L. "The Present Status of Variance
Component Analysis," Biometrics,
VII (1951), pp. 1-16.

7. Johnson, Palmer O. Statistical Methods in
Research (New York: Prentice-Hall, Inc.,
1949), 377 pp.

8. Johnson, Palmer O. "Modern Statistical
Science and its Function in Educational and
Psychological Research," Scientific Monthly,
LXII (1951), pp. [...]

9. Lucow, William Harrison. The Use of Analysis
of Variance in Estimating the Components
of Variation in an Experimental Study
of Learning. Unpublished Ph.D. Thesis,
College of Education, University of Minnesota,
1953.
A PROCEDURE FOR ANALYZING A TEST AND 
MAXIMIZING ITS RELIABILITY 


ANGUS G. MACLEAN and ARTHUR T. TAIT 
California Test Bureau 
Los Angeles, California 


T 
for oo. PRESENTATION ofa procedure 
ity is aaa a test and maximizing its reliabil- 
computati ed into three parts: (1) overview and 
to illust tonal procedure, (2) a fictitious example 
basic ο rate the procedure, and (3) review of 

«S,  OnCepts, 
teney crabllity" here refers to internal consis- 
ms apg c MOgeneity. It is assumed that all it- 
Not qj;,-. SCOred 1 or 0, and that the items are 

Srentially weighted. 


et : 
data ne echnique produces in one operation all 
cessary for; 


a 
à; ie certaining item 'difficulties', so that 
and th may be placed in order of difficulty, 
Sym e desired distribution of test scores 
s eee about the mean, negatively 
ing iter, etc, ,) may be obtained by select- 
(2) ses items on the basis of difficulty. 
Conte ne items on the basis of their 
o Jono song to error of measurementand 
ms emi HA, in order to reject those it- 
Ὁ reli ich contribute more to error than 
rect alty. This procedure is more 
lon d S than the use of item-test correla- 
ο mol and appears to involve less, 
mation e labor, since much of the infor- 
1) iud is being obtained simultaneously 
ial Pi for (3). However, point-biser- 
is relations may be obtained on the 
ditio, 9 K-sheet by carrying out one ad- 
(3) ς anal operation. 
(4 eney) ae the reliability (internal consis- 
)c input the complete set of items. 
Subs, ΠΕ the reliability of any desired 
(5 Subset = items, including that of the best 
) Obtains rom the point of view of reliability. 
(6) 284 devie the mean, variance, and stand- 
) Calen tion of the test. 
(if the iting the inter-item correlations 
With a €st is not found to be homogeneous) 
view to further subdivision of the 


bo k 
tests Sed test into highly homogeneous sub- 


T Y reclustering the items. 
The description below is in terms of IBM equipment, but equivalent hand procedures may be devised to set up the F-matrix, which is no more than a frequency table.
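As a hedged illustration of what "no more than a frequency table" means, the following minimal sketch tallies an F-matrix from scored responses. The function name `f_matrix` and the four-person, three-item data set are hypothetical and are not taken from the paper; the sketch only assumes that each person's record is a list of 1/0 item scores.

```python
def f_matrix(responses):
    """Tally the F-matrix: cell (i, j) counts persons passing both items i and j.

    The diagonal cell (i, i) is therefore the number of persons passing item i.
    """
    n_items = len(responses[0])
    f = [[0] * n_items for _ in range(n_items)]
    for person in responses:
        for i in range(n_items):
            if person[i] == 1:
                for j in range(n_items):
                    if person[j] == 1:
                        f[i][j] += 1
    return f


# Hypothetical scores: four persons (rows) on three items (columns).
responses = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [0, 0, 0],
]
F = f_matrix(responses)
for row in F:
    print(row)
```

The matrix is symmetric by construction, which is why only one triangle needs to be tabulated by hand or machine.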


1. Computational procedure 


a. Punch all item scores (1 or 0) for each individual in the sample on IBM cards.

b. Use an IBM sorter, electronic statistical machine, or tabulator to generate an F-matrix, defined as follows:

row 1 and column 1 refer to item number 1 and the cell-entries consist of the total number of correct answers, or frequency (f) of "passes";

in row 1, cell 1, the entry records the number passing on item 1 alone; in row 1, cell 2 (as in column 1, cell 2) is entered the number of persons who gave correct answers to both items 1 and 2.

Thus, entries on the principal diagonal (top left to bottom right) give the "f-values" of the n items in the test. These may be converted to p-values by dividing by N, the total number of cases. "P" is a common index of item difficulty, used to place items in order of difficulty. Also, $(p - p^2)$ is the variance of the item, and $\sqrt{p - p^2}$ its standard deviation. 100p is the percent passing, and is sometimes used in reporting to avoid decimals.

The side entries (non-diagonal entries) are analogous to cross-products in correlation. Divided by N they give the proportion of subjects who obtained correct answers to the two items indicated by their row and column number. The two triangles formed by the non-diagonal entries will be symmetrical, i.e., $f_{ij} = f_{ji}$.

Diagonal entries will be denoted $p_{ii}$ and side entries $p_{ij}$, it being understood that $i \neq j$.

c. Sum all the entries (f-values) in each row and record each row total in a column denoted as (a). These entries may be denoted $\sum f_{ri}$.


JOURNAL OF EXPERIMENTAL EDUCATION (Vol. XXII, March, 1954)

d. Sum all the entries on the principal diagonal and record their total. This sum is denoted as $\sum f_{ii}$.

e. In a new column, denoted (b), enter the $\sigma_{iT}$ or item-test covariance value for each row. The computing formula is

$$\sigma_{iT} = \frac{1}{N}\left(\sum f_{ri} - \frac{f_{ii}}{N}\sum f_{ii}\right)$$

where $f_{ii}$ is the diagonal entry in row i and $\sum f_{ri}$ is the row total for item i already entered in column (a).

f. Sum column (b) and record the total. This value is $\sigma_T^2$, the total variance or variance of the whole test.

g. Obtain the corresponding $\sigma_i^2$ for each diagonal entry, $f_{ii}$, by

$$\sigma_i^2 = p_{ii} - p_{ii}^2 = \frac{f_{ii}}{N} - \frac{f_{ii}^2}{N^2}$$

and record in column (c).

h. Subtract the entries (row by row) in column (c) from those in column (b). Enter in column (d). These entries constitute $\sum \sigma_{ij}$, $i \neq j$, or the sum, for item i, of its covariances with all other items.

i. Sum column (d) and record the total. This gives us $\sigma_H^2$, or $\frac{n-1}{n}$ times the total amount of "true" variance present. As a check, sum column (c). The latter total, when added to that for column (d), should equal the total for column (b), or

$$\sigma_T^2 = \sigma_H^2 + \sum \sigma_i^2$$

j. Record in the last column, (e), the differences obtained by subtracting the entries in column (c) from those in column (d), i.e., obtain $S_i$, the selection index (for item i), by:

$$S_i = \sum \sigma_{ij} - \sigma_i^2, \quad (i \neq j)$$

k. All items contribute to both error and true variance, provided they correlate positively with the other items. If an item's variance $(p - p^2)$ is greater than the sum of its covariances, $\sum \sigma_{ij}$, with all other items in the test, then that item is actually lowering the reliability of the test. Indeed, some items have a negative covariance total. Many tests in actual use do contain such items. As an example of how drastic this effect may be, one proposed test of 25 apparently homogeneous items, analyzed by this method, proved to contain 17 items whose S-index was negative. The reliability estimated by the Kuder-Richardson formula 20 was .297. When these 17 items were struck out the remaining eight items possessed a reliability of .550! The total variance was only halved, while the true variance actually increased because of the rejection, among items whose covariance totals were smaller than their variances, of some items which actually had negative totals.

l. Having reached a final decision as to which items are to be retained, the totals for columns (b) and (d), or total variance and true variance, are recomputed, using only the chosen items. Then the reliability coefficient, $r_{tt}$, based on these items, is computed:

$$r_{tt} = \frac{n}{n-1} \cdot \frac{\sigma_H^2}{\sigma_T^2}$$

where n = the number of items.


Two qualifications are in order; first, the above formula, known as Kuder-Richardson formula 20, gives the lower limit of the true reliability, rather than the reliability itself. It is nevertheless the most widely used estimator of internal consistency. Secondly, the definition of reliability is

$$r_{tt} = \frac{\text{true variance}}{\text{total variance}}$$

yet the K-R formula requires the correction factor $\frac{n}{n-1}$, implying that the total for column (d) underestimates the true variance. However, reference has been made to it as "true variance" for the sake of simplicity and lucidity.

If Horst's (3) recent formula is preferred it is a simple matter to arrange the item-difficulties (p's) in descending order of magnitude, to multiply each p by its corresponding rank-number, and to sum the products to obtain $\sum i\,p_{ii}$. The formula then is:

$$r_{tt} = \frac{\sigma_H^2}{\sigma_T^2} \cdot \frac{\sigma_m^2}{\sigma_m^2 - \sum \sigma_i^2}$$

where

$$\sigma_m^2 = 2\sum i\,p_{ii} - \sum p_{ii}\left(1 + \sum p_{ii}\right)$$
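The Horst computation just described can be sketched as follows. This is a minimal illustration, not the authors' worksheet: the function name `horst_reliability` is hypothetical, and the sketch assumes that the item variances are $p(1-p)$ and that $\sigma_H^2$ is the total variance less the sum of the item variances. The difficulties and total variances used in the demonstration are those of the paper's fictitious five-item example, before and after dropping the item with difficulty .68.

```python
def horst_reliability(p_values, var_total):
    """Reliability by the Horst formula as given in the text.

    p_values  : item difficulties (proportion passing each item)
    var_total : variance of the whole test (sigma_T squared)
    """
    ranked = sorted(p_values, reverse=True)           # descending order of p
    sum_p = sum(ranked)
    sum_ip = sum(rank * p for rank, p in enumerate(ranked, start=1))
    var_items = sum(p * (1.0 - p) for p in ranked)    # sum of item variances
    var_homog = var_total - var_items                 # sigma_H squared
    var_m = 2.0 * sum_ip - sum_p * (1.0 + sum_p)      # sigma_m squared
    return (var_homog / var_total) * (var_m / (var_m - var_items))


# Difficulties and total variances from the paper's fictitious example.
r_five = horst_reliability([0.80, 0.68, 0.61, 0.32, 0.25], 2.1844)
r_four = horst_reliability([0.80, 0.61, 0.32, 0.25], 1.8796)
print(round(r_five, 3), round(r_four, 3))
```

Sorting inside the function means the caller need not pre-rank the items, which mirrors the hand instruction to arrange the p's in descending order first.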


To sum up:

Column (b) provides the item-test covariances, and their sum provides the variance of the whole test, $\sigma_T^2$.

Column (c) records the item variances, which can never exceed .2500, but which, for optimal discrimination, should approach .2500. As a rule an item variance should not fall below .1000, because then the item would be so easy or so hard that it would not be very useful. An exception could be made, of course, for the first few and last few items.

Column (d) records the inter-item covariance-sums, and its total is $\sigma_H^2$, which when multiplied by $\frac{n}{n-1}$ estimates the true variance of the test.

Column (e) lists the S-indices which identify the items which are detracting from reliability, also those which are contributing little, and those which are the most desirable.

Once the items are chosen computations may be made for the mean, variance, standard deviation, reliability and standard error of measurement of the test. If it is desired to record the item-test correlations, the following formula may be used:

$$r_{iT} = \frac{\sigma_{iT}}{\sqrt{\sigma_i^2\,\sigma_T^2}}$$

This is a point-biserial coefficient. ($\sigma_{iT}$ is the entry in column b for item i.)

The mean of the test is equal to $\sum p_{ii}$, or, since the sum of the f-entries on the principal diagonal is already recorded, the mean $= \frac{\sum f_{ii}}{N}$.

The item "difficulties" are simply obtained by converting the entries on the principal diagonal to proportions.

If for any reason inter-item correlations are desired,

$$r_{ij} = \frac{p_{ij} - p_{ii}\,p_{jj}}{\sqrt{(p_{ii} - p_{ii}^2)(p_{jj} - p_{jj}^2)}}$$

where $p_{ii}$ is the proportion of subjects giving the correct response to item i, $p_{jj}$ the proportion correct on item j, and $p_{ij}$ the proportion correct on both. This formula results in a phi-coefficient. If the overall homogeneity proves to be low, these item-intercorrelations may be used to make a loose cluster analysis, in order to break down the test into homogeneous subtests. A formal factor analysis would usually involve too much labor for this purpose, and phi-coefficients are only roughly comparable for items of varying difficulty.¹ It is advisable to carry four places of decimals throughout.
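The two correlation formulas above can be sketched in a few lines. The function names `phi` and `point_biserial` are illustrative only; the numerical check uses the proportions for items 1 and 3 of the paper's fictitious example (.80, .61, and .59 passing both).

```python
import math


def phi(p_i, p_j, p_ij):
    """Phi coefficient between two 1/0 items from their passing proportions."""
    covariance = p_ij - p_i * p_j
    return covariance / math.sqrt((p_i - p_i ** 2) * (p_j - p_j ** 2))


def point_biserial(cov_item_test, var_item, var_test):
    """Item-test correlation from the item-test covariance and the two variances."""
    return cov_item_test / math.sqrt(var_item * var_test)


# Items 1 and 3 of the fictitious example: p = .80 and .61, with .59 passing both.
r_13 = phi(0.80, 0.61, 0.59)
print(round(r_13, 3))
```

Both functions divide a covariance by the geometric mean of the relevant variances, so they differ only in which covariance and which variances are supplied.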


¹ If it is decided to perform such a re-clustering, it is advisable to correct the inter-item φ's by dividing each by its corresponding φ-max. See Guilford (1).

2. Example

A fictitious five-item test yields the following information when the results of a try-out on 100 cases are processed:

                  F-matrix                          Analysis
Item      1    2    3    4    5       a        b        c        d        e
  1      80   66   59   31   23     259    .4620    .1600    .3020    .1420
  2      66   68   37   26   10     207    .2612    .2176    .0436   -.1740
  3      59   37   61   30   24     211    .4874    .2379    .2495    .0116
  4      31   26   30   32   24     143    .5788    .2176    .3612    .1436
  5      23   10   24   24   25     106    .3950    .1875    .2075    .0200
Total    $\sum f_{ii}$ = 266        926   2.1844   1.0206   1.1638

$M_T = \sum p_{ii} = 2.66$

Computations for item 1

Column b: $\sigma_{1T} = \frac{1}{100}(259 - .80 \times 266) = .4620$.
Check on total: $\sigma_T^2 = \frac{1}{100}(926 - 2.66 \times 266) = 2.1844$.

Column c: $\sigma_1^2 = .80 - .80^2 = .1600$.

Column d: $\sum \sigma_{1j}$ (covariance total for item 1) $= .4620 - .1600 = .3020$.
Check on total: $\sigma_H^2 + \sum \sigma_i^2 = \sigma_T^2$.

Column e: $S_1 = .3020 - .1600 = .1420$.
Check on total: $\sum S_i + 2\sum \sigma_i^2 = \sigma_T^2$.

Item-test correlation for item 1, a point-biserial:

$$r_{1T} = \frac{\text{item-test covariance}}{\sqrt{\text{item variance} \times \text{test variance}}} = \frac{.3020}{\sqrt{.1600 \times 2.1844}} = .511.$$

Covariance between items 1 and 3 $= p_{13} - p_{11}p_{33} = .59 - .80 \times .61 = .1020$.

Correlation between items 1 and 3, a phi coefficient, $= \frac{.1020}{\sqrt{.1600 \times .2379}} = .523$.

Computations for the test as a whole

Mean $= \sum p_{ii} = \frac{\sum f_{ii}}{N} = 2.66$.

Variance $= \sigma_T^2 = 2.1844$.

Standard Deviation $= \sqrt{2.1844} = 1.48$.

Order of difficulty: Correct as it is.

Reliability (K-R 20): $r_{tt} = \frac{5}{4} \cdot \frac{1.1638}{2.1844} = .666$.

(Horst): $\sum i\,p_{ii} = (1 \times .80) + (2 \times .68) + \ldots$ etc. $= 6.52$.

$\sigma_m^2 = 13.04 - (2.66)(3.66) = 3.3044$.

$$r_{tt} = \frac{1.1638}{2.1844} \cdot \frac{3.3044}{3.3044 - 1.0206} = .771.$$
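The worksheet columns of the example can be reproduced with a short sketch such as the one below. It follows steps c through l of the computational procedure; the names `worksheet`, `var_total`, and so on are illustrative. One assumption should be noted: the off-diagonal frequencies 37, 26, and 30 (and the diagonal entry 32) are not legible in the printed matrix and are inferred here from the row totals implied by column b.

```python
# F-matrix of the fictitious five-item example, N = 100 cases.
# Entries 37, 26, 30 and 32 are inferred from the column totals (an assumption).
F = [
    [80, 66, 59, 31, 23],
    [66, 68, 37, 26, 10],
    [59, 37, 61, 30, 24],
    [31, 26, 30, 32, 24],
    [23, 10, 24, 24, 25],
]
N = 100


def worksheet(f, n_cases):
    """Return columns (a) to (e) of the worksheet for each item."""
    k = len(f)
    diag_total = sum(f[i][i] for i in range(k))    # sum of f_ii
    rows = []
    for i in range(k):
        a = sum(f[i])                               # (a) row total
        p = f[i][i] / n_cases                       # item difficulty
        b = (a - p * diag_total) / n_cases          # (b) item-test covariance
        c = p - p * p                               # (c) item variance
        d = b - c                                   # (d) sum of covariances with other items
        e = d - c                                   # (e) selection index S
        rows.append({"a": a, "b": b, "c": c, "d": d, "e": e})
    return rows


rows = worksheet(F, N)
var_total = sum(r["b"] for r in rows)               # sigma_T squared
var_items = sum(r["c"] for r in rows)               # sum of item variances
var_homog = sum(r["d"] for r in rows)               # sigma_H squared
n_items = len(F)
kr20 = n_items / (n_items - 1) * var_homog / var_total

for number, r in enumerate(rows, start=1):
    print(number, r["a"], *(round(r[col], 4) for col in "bcde"))
print(round(var_total, 4), round(var_items, 4), round(var_homog, 4), round(kr20, 3))
```

Item 2 is the only item with a negative selection index, which is the basis for rejecting it in the item-selection step.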


Item selection

On the basis of the S-indices, item 2 might be rejected. The new total for column a is:

193 + 174 + 117 + 96 = 580.

The new $\sum f_{ii}$ is: 266 - 68 = 198.

The new $\sigma_T^2 = 5.80 - (1.98)^2 = 1.8796$.

The new $\sum \sigma_i^2 = 1.0206 - .2176 = .8030$.

Then the reliability of the remaining items is, by the Kuder-Richardson formula:

$$r_{tt} = \frac{4}{3} \cdot \frac{1.8796 - .8030}{1.8796} = .764.$$

The omission of item 2, which has the negative S-index, has raised the K-R reliability by about .10 even though it has dropped the total variance.

By the Horst formula, when $\sum i\,p_{ii} = 3.98$ and $\sigma_m^2 = 2.0596$,

$$r_{tt} = \frac{1.0766}{1.8796} \cdot \frac{2.0596}{2.0596 - .8030} = .939.$$

This value represents a considerable increase in reliability over the value of .771 obtained prior to the deletion of item 2.

As is well known, the K-R formula 20 underestimates the true reliability when certain assumptions are violated, such as the assumption that item difficulties are equal. Horst's formula yields a coefficient corrected for the attenuation due to variation of the difficulties. It is clear in the above example how great this attenuation may be, even in 4 items whose difficulties lie between .80 and .20, i.e., whose difficulties are by no means extreme.

3. Review of basic concepts

The computational formulas given above were derived to facilitate working from the raw F-matrix rather than from the item variance-covariance matrix into which it can be transformed. This matrix is derived from F as follows: the item variances will lie along the principal diagonal, and will be derived from the corresponding frequencies by:

$$\sigma_i^2 = \frac{f_{ii}}{N} - \frac{f_{ii}^2}{N^2}$$

The side entries will be the inter-item covariances,

$$\sigma_{ij} = p_{ij} - p_{ii}\,p_{jj} = \frac{f_{ij}}{N} - \frac{f_{ii}}{N} \cdot \frac{f_{jj}}{N}, \quad i \neq j.$$


Then the following statements are true:

The sum of the entries in row i, including the diagonal entry, is the covariance of item i with the total test.²

The sum of these sums, or the sum of all entries in the matrix, is the variance of the total test.²

The sum of the diagonal entries is the term $\sum \sigma_i^2$, often written $\sum pq$, the sum of the item variances.

The difference between the last two terms, $\sigma_T^2 - \sum \sigma_i^2$, is $\sigma_H^2$, or the sum of all the inter-item covariances in the matrix. It is therefore the "amount of homogeneity" present.

The sum of the entries in row i, excluding the diagonal entry, is the element that has been denoted as $\sum \sigma_{ij}$, and the sum of these sums is $\sigma_H^2$. $\sum \sigma_{ij}$ has been referred to as the covariance-total for item i, and, if this is larger than the variance of item i, then i is evidently contributing more to the reliability of the total test than it is to error variance. If the reverse is true, then removal of the item will result in an increase in the reliability.

The inter-item and item-test covariances may be converted into phi and point-biserial coefficients respectively, by dividing them by the geometric mean of the relevant variances, since a covariance coefficient is equal to the corresponding correlation coefficient multiplied by the product of the standard deviations of the two variables concerned. The necessary variances are already available.

Thus, the procedure, suggested above, of summing the raw frequencies across the rows and along the principal diagonal, and using modified formulas, eliminates the tedious process of converting every element in the F-matrix into a variance or covariance term. In addition, the method as a whole is an especially powerful and precise technique for analyzing an existing test or for improving a test under development or revision. Also, it is highly desirable to have all data concerning a test computed in one operation and recorded on the same sheet, preferably on the IBM data sheet. Where this is done, all data concerning the test are mutually consistent, and, if the initial punching is verified, the basic data are completely reliable for all the different computations, which are too often done at different times by different workers, with minor discrepancies and much duplication of computation.

² These two statements are based on findings reported by Gulliksen (2).
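The statements of this review can be checked numerically by actually forming the variance-covariance matrix, as in the sketch below. The function names are illustrative, and the F-matrix is that of the fictitious example, with the entries 37, 26, 30, and 32 inferred from the column totals rather than read from the printed table (an assumption). The reliability of any subset of items is obtained by summing over the corresponding sub-matrix.

```python
F = [
    [80, 66, 59, 31, 23],
    [66, 68, 37, 26, 10],
    [59, 37, 61, 30, 24],
    [31, 26, 30, 32, 24],
    [23, 10, 24, 24, 25],
]
N = 100


def covariance_matrix(f, n_cases):
    """Item variance-covariance matrix: f_ij/N - (f_ii/N)(f_jj/N)."""
    k = len(f)
    p = [f[i][i] / n_cases for i in range(k)]
    return [[f[i][j] / n_cases - p[i] * p[j] for j in range(k)] for i in range(k)]


def kr20(cov, items):
    """Kuder-Richardson formula 20 for the items whose indices are in `items`."""
    total = sum(cov[i][j] for i in items for j in items)   # variance of the subtest
    item_variances = sum(cov[i][i] for i in items)
    n = len(items)
    return n / (n - 1) * (total - item_variances) / total


C = covariance_matrix(F, N)
item_test_cov_1 = sum(C[0])                        # row sum = covariance of item 1 with test
test_variance = sum(sum(row) for row in C)         # sum of all entries
r_all = kr20(C, [0, 1, 2, 3, 4])
r_without_item_2 = kr20(C, [0, 2, 3, 4])
print(round(item_test_cov_1, 4), round(test_variance, 4))
print(round(r_all, 3), round(r_without_item_2, 3))
```

Working from the full matrix costs more arithmetic than the row-total shortcut, but it makes subset reliabilities a matter of choosing indices.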


REFERENCES 


1. Guilford, J. P. Fundamental Statistics in Psychology and Education (New York: McGraw-Hill Book Co., 1950), p. 343.

2. Gulliksen, Harold. Theory of Mental Tests (New York: John Wiley and Sons, 1950), pp. 376f.

3. Horst, Paul. "Correcting the Kuder-Richardson Reliability for Dispersion of Item Difficulties," Psychological Bulletin, L (1953), pp. 371-374.


THE LEAST-SQUARES ANALYSIS OF A p x q x r
FACTORIAL DESIGN WITH UNEQUAL
SUBCLASS FREQUENCIES*


RAYMOND O. COLLIER, Jr. 
University of Minnesota 


Introduction

IN MANY instances experimenters are plagued with the problem of analyzing designs which result in unequal subclass frequencies. Unfortunately, in the analysis of such designs having two or more factors, non-orthogonality of the effects makes it impossible to estimate them as simply as is possible in the orthogonal case of equal subclass frequencies.

One common practice consists of a random withdrawal of cases so that orthogonality of effects is obtained. However, there are situations in which this procedure results in a serious decrease in the error degrees of freedom of the design such that analysis is inadvisable. (This is particularly true if the power of the tests of significance is considered.) Furthermore, to many investigators this method of frequency equalization is equivalent to discarding information.

The two-factor design involving unequal subclass frequencies has been explicitly treated by Anderson and Bancroft (1) and Kempthorne (3). Therefore, the purpose of this paper will be to give some general explanations of the least-squares solutions of a three-factor design and, thereafter, to present an actual problem showing the analysis of such a design.

General Remarks

For the three-factor (p x q x r factorial) design, consider the following model:

(1) $Y_{ijkl} = \mu + \alpha_i + \beta_j + \gamma_k + \pi_{ij} + \psi_{ik} + \phi_{jk} + \tau_{ijk} + \epsilon_{ijkl}$

where i = 1, 2, ..., p; j = 1, 2, ..., q; k = 1, 2, ..., r; l = 1, 2, ..., $n_{ijk}$; $\mu$ is the fixed general mean; $\alpha$, $\beta$, $\gamma$ are the fixed main effects; $\pi$, $\psi$, $\phi$, $\tau$ are fixed interaction effects. The $\epsilon_{ijkl}$ are assumed to be normally and independently distributed with mean 0 and variance $\sigma^2$. Hypotheses of interest are $H_1: \alpha_i = 0$, $H_2: \beta_j = 0$, $H_3: \gamma_k = 0$, $H_4: \pi_{ij} = 0$, $H_5: \psi_{ik} = 0$, $H_6: \phi_{jk} = 0$, $H_7: \tau_{ijk} = 0$.

Notice that the model (1) above includes a total of pr + pq + rq + pqr interaction parameters. Even in the simplest case possible, i.e., when p = q = r = 2, there are 20 interaction constants to be estimated, an imposing task.

Instead, assume that the interaction parameters are zero and rewrite the model as:

(2) $Y_{ijkl} = \mu + \alpha_i + \beta_j + \gamma_k + \epsilon_{ijkl}$

and obtain the p + q + r + 1 estimates for $\mu$, $\alpha_i$, $\beta_j$ and $\gamma_k$. Furthermore, we will later use the complete model

(3) $Y_{ijkl} = \mu + \Omega_{ijk} + \epsilon_{ijkl}$

i.e., one which attributes to each ijkth subclass a fixed effect. This provides the determination of the sum of squares due to all interactions, in other words a "pooled interaction" sum of squares.

To estimate $\mu$, $\alpha_i$, $\beta_j$, and $\gamma_k$, let us employ normal regression theory. The normal (least squares) equations are obtained by minimizing $\sum_{ijkl} \epsilon_{ijkl}^2$ in (2) with respect to $\mu$, $\alpha_i$, $\beta_j$ and $\gamma_k$. Thus, the normal equations can be written as functions of m, $a_i$, $b_j$, and $c_k$, the estimates of $\mu$, $\alpha_i$, $\beta_j$, and $\gamma_k$, respectively:

(4)
$m$: $\; n\,m + \sum_i n_{i..}a_i + \sum_j n_{.j.}b_j + \sum_k n_{..k}c_k = Y_{....}$
$a_i$: $\; n_{i..}m + n_{i..}a_i + \sum_j n_{ij.}b_j + \sum_k n_{i.k}c_k = Y_{i...}$
$b_j$: $\; n_{.j.}m + \sum_i n_{ij.}a_i + n_{.j.}b_j + \sum_k n_{.jk}c_k = Y_{.j..}$
$c_k$: $\; n_{..k}m + \sum_i n_{i.k}a_i + \sum_j n_{.jk}b_j + n_{..k}c_k = Y_{..k.}$

where $n = n_{...}$ and the subscript "dot" on any term signifies a summation over that subscript, e.g., $n_{ij.} = \sum_k n_{ijk}$.

* The author is particularly thankful to Dr. Cyril J. Hoyt, not only in providing the opportunity for examining this problem, but also for continued encouragement. Data used in the example are part of data collected in a study conducted by Dr. Raymond G. Price and Dr. Cyril J. Hoyt, Bureau of Educational Research, University of Minnesota.


The equations of (4), however, are not linearly independent. Therefore, to solve uniquely for these estimates by any calculation method, it is necessary to impose three linearly independent restrictions on the parameters. Except for being independent these are quite arbitrary. Notice, however, that if the restrictions for (2)

(5) $\sum_i n_{i..}\alpha_i = 0, \quad \sum_j n_{.j.}\beta_j = 0, \quad \sum_k n_{..k}\gamma_k = 0$

are used we have $E(Y_{....}/n) = \mu$.

Also in (4), if the restrictions made on $a_i$, $b_j$, and $c_k$ are

(6) $\sum_i n_{i..}a_i = 0, \quad \sum_j n_{.j.}b_j = 0, \quad \sum_k n_{..k}c_k = 0$

then $m = Y_{....}/n$, the general mean, a result consistent with the case in which the subclass frequencies are equal.

Now m, $a_i$, $b_j$, and $c_k$ may be determined by one of the many methods for solving simultaneous linear equations. Suppose that we have solved for these estimates. With the present setup we will not be able to test separately the previously mentioned hypotheses $H_4$, $H_5$, $H_6$, $H_7$; however, it will be possible to test $H_4'$: $\pi_{ij} = \psi_{ik} = \phi_{jk} = \tau_{ijk} = 0$. We know that the likelihood ratio test of this hypothesis involves finding the error sum of squares for the complete model (3). The error sum of squares for this model is the "within subclasses" sum of squares

(7) $SSE = \sum_{ijkl} Y_{ijkl}^2 - \sum_{ijk} \left(\sum_l Y_{ijkl}\right)^2 / n_{ijk}$

From (3) we note that this third model is equivalent to that of the one-way analysis of variance and that the sum of squares due to the hypothesis $\Omega_{ijk} = 0$ is the "between subclasses" sum of squares (SSB) given by

(8) $SSB = \sum_{ijk} \left(\sum_l Y_{ijkl}\right)^2 / n_{ijk} - (Y_{....})^2 / n$

The total sum of squares (SST) around the general mean, $Y_{....}/n$, is given as

SST = SSE + SSB.
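The within-subclass and between-subclass sums of squares of (7) and (8) can be sketched for any set of unequal subclasses. The data below are hypothetical (they are not the typing-study data), and the function name `sums_of_squares` is illustrative; the sketch also verifies the identity SST = SSE + SSB against a direct computation of the within-cell deviations.

```python
def sums_of_squares(cells):
    """Return (SSE, SSB, SST) for a dict mapping subclass labels to lists of scores."""
    all_scores = [y for scores in cells.values() for y in scores]
    n = len(all_scores)
    grand_total = sum(all_scores)
    raw = sum(y * y for y in all_scores)                              # sum of Y squared
    cell_term = sum(sum(s) ** 2 / len(s) for s in cells.values())     # sum of (cell total)^2 / n_ijk
    sse = raw - cell_term                                             # formula (7)
    ssb = cell_term - grand_total ** 2 / n                            # formula (8)
    sst = raw - grand_total ** 2 / n                                  # about the general mean
    return sse, ssb, sst


# Hypothetical unequal subclasses, keyed by (i, j, k).
cells = {
    (1, 1, 1): [52.0, 48.0, 50.0],
    (1, 2, 1): [61.0],
    (2, 1, 1): [45.0, 47.0],
    (2, 2, 1): [58.0, 62.0, 60.0, 64.0],
}
sse, ssb, sst = sums_of_squares(cells)

# Direct check of SSE as squared deviations from each subclass mean.
sse_direct = sum(
    (y - sum(s) / len(s)) ** 2 for s in cells.values() for y in s
)
print(sse, ssb, sst)
```

A subclass with a single observation contributes nothing to SSE, which is why the error degrees of freedom are n minus the number of filled subclasses.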


From least-squares regression theory, the sum of squares of the estimating constants, m, $a_i$, $b_j$, and $c_k$ is

(9) $SS(m, a, b, c) = m\,Y_{....} + \sum_i a_i Y_{i...} + \sum_j b_j Y_{.j..} + \sum_k c_k Y_{..k.}$

Therefore, the reduction in the sum of squares due to the hypothesis $H_4'$ is

(10) $SSH_4' = SSB - SS(m, a, b, c) + SS(m) = \sum_{ijk} \left(\sum_l Y_{ijkl}\right)^2 / n_{ijk} - SS(m, a, b, c).$

Now to obtain the sum of squares due to $H_1$: $\alpha_i = 0$, set the $a_i$ equal to zero in the normal equations, delete the a-equations, and solve for $m'$, $b_j'$, $c_k'$, the constants under the hypothesis.

From (9), $SS(m, b, c) = m'\,Y_{....} + \sum_j b_j' Y_{.j..} + \sum_k c_k' Y_{..k.}$

Consequently,

(11) $SSH_1 = SS(m, a, b, c) - SS(m, b, c)$.

Similarly, by setting $b_j = 0$, and then $c_k = 0$, in the normal equations, and proceeding as before, we find

(12) $SSH_2 = SS(m, a, b, c) - SS(m, a, c)$
(13) $SSH_3 = SS(m, a, b, c) - SS(m, a, b)$.

It is worthwhile to state at this point that if a calculation method known as pivotal condensation (see reference 2) is utilized, certain relationships are made obvious which may be applied as a method for finding $SSH_1$, $SSH_2$, $SSH_3$. In this method, various constants are "swept out," i.e., deleted by adding certain multiples of equations. Thus, if in our normal equations both the $a_i$ and $b_j$ constants are "swept out" from the $c_k$ equations, these remaining equations will be called the adjusted c (adjusted for a and b) equations. A sum of squares due to the adjusted c's may be calculated which will be called the sum of squares of c adjusted for a and b or SS(c | a, b). Similarly, SS(c | b) will refer to the sum of squares of c adjusted for b alone, etc. In general it may be shown that

(14) SS(c | a, b) + SS(a | b) + SS(b) + SS(m) = SS(m, a, b, c)
(15) SS(a | b, c) + SS(c | b) + SS(b) + SS(m) = SS(m, a, b, c)
(16) SS(b | a, c) + SS(a | c) + SS(c) + SS(m) = SS(m, a, b, c)

and also that

(17) $SSH_1$ = SS(a | b, c)
(18) $SSH_2$ = SS(b | a, c)
(19) $SSH_3$ = SS(c | a, b).


It is clear then that equations (11), (12), (13) or (17), (18), (19) may be used to obtain $SSH_1$, $SSH_2$, and $SSH_3$.

The analysis of variance may now be written as shown in Table I. Because the interaction parameters were assumed to be zero in the model (2), there exist tests for $H_1$, $H_2$, $H_3$ only if $H_4'$ is accepted.

TABLE I
ANALYSIS OF VARIANCE

Source of Variation                  Degrees of Freedom     Sum of Squares   Mean Square   F
A ($H_1$: $\alpha_i = 0$)            p - 1                  SSH1             MSH1          MSH1/s²
B ($H_2$: $\beta_j = 0$)             q - 1                  SSH2             MSH2          MSH2/s²
C ($H_3$: $\gamma_k = 0$)            r - 1                  SSH3             MSH3          MSH3/s²
I: Pooled Interaction                pqr - p - q - r + 2    SSH4'            MSH4'         MSH4'/s²
  ($H_4'$: $\pi_{ij} = \psi_{ik} = \phi_{jk} = \tau_{ijk} = 0$)
Error                                n - pqr                SSE              s²

TABLE II
ANALYSIS OF VARIANCE FOR EXAMPLE

Source of Variation    d.f.    SS               MS              F       Decision
A                      1        1,662,016.17    1,662,016.17    3.77    Accept
B                      3        4,229,771.40    1,409,923.80    3.20    Reject
C                      2          863,471.26      431,735.63    <1      Accept
Pooled Interaction     17       8,480,332.79      498,843.11    1.131   Accept
Error                  72      31,756,570.43      441,063.48
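The arithmetic behind the mean squares and F ratios of the analysis-of-variance tables can be sketched as follows. The sums of squares and degrees of freedom are those reported for the paper's example; the function name `f_ratios` is illustrative, and the sketch deliberately stops at the F ratios, since the accept or reject decisions depend on tabled critical values that are not computed here.

```python
def f_ratios(effects, error_df, error_ss):
    """Mean squares and F ratios against the within-subclass error mean square."""
    s2 = error_ss / error_df
    table = {}
    for name, df, ss in effects:
        ms = ss / df
        table[name] = (ms, ms / s2)
    return s2, table


# Design of the example: p = 2, q = 4, r = 3 levels, n = 96 cases.
p, q, r, n = 2, 4, 3, 96
interaction_df = p * q * r - p - q - r + 2
error_df = n - p * q * r

effects = [
    ("A", p - 1, 1662016.17),
    ("B", q - 1, 4229771.40),
    ("C", r - 1, 863471.26),
    ("Pooled interaction", interaction_df, 8480332.79),
]
s2, table = f_ratios(effects, error_df, 31756570.43)
for name, (ms, f) in table.items():
    print(f"{name:20s} MS = {ms:14.2f}  F = {f:.3f}")
```

The pooled interaction is tested first; only if it is judged negligible are the main-effect ratios interpreted under the additive model.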


An Example

The data used in the following example were collected in a typing-reading relationships study. The three factors included were:

A: 2 levels of order of presentation of typing and reading materials

B: 4 levels of high school typing classes

C: 3 levels of typing tests

The basic variable $Y_{ijkl}$ was a typing speed score for the lth individual of the ith level of A, the jth level of B and the kth level of C.

From (4) we write the normal equations as shown in (20). If we take these equations and impose the restrictions (6), we can first solve for $c_1$, $c_2$, and $c_3$, and then for m, $a_1$, $a_2$, $b_1$, $b_2$, $b_3$, and $b_4$. This has been done, yielding as solutions:

m = 4463.072917, $a_1$ = 132.817457, $a_2$ = -132.817457, $b_1$ = -284.950488, $b_2$ = 89.981636, $b_3$ = 246.767258, $b_4$ = -51.435321, $c_1$ = ..., $c_2$ = -121.084942, $c_3$ = 106.511481.


And from (9) we see that the sum of squares due to the estimates m, a's, b's and c's is

SS(m, a, b, c) = (4463.072917)(428455) + (132.817457)(220015) + (-132.817457)(208440) + ... + (106.511481)(137539) = 1,918,744,217.773958.

Calculations were first made of the "within subclasses" sum of squares using (7) and of the "between subclasses" sum of squares using (8). They were

SSE = 31,756,570.4334, SSB = 14,998,643.91, SS(m) = 1,912,225,906.65.

Consequently, from (10)

$SSH_4'$ = 8,480,332.79.

We should test $H_4'$ immediately by considering $F_4 = (\text{d.f. } E)(SSH_4') / (\text{d.f. } H_4')(SSE)$, which is distributed as Snedecor's F with $n_1$ = d.f. $H_4'$ and $n_2$ = d.f. E. This test was made and the hypothesis accepted in our example. Thus it was admissible to proceed in testing $H_1$, $H_2$, and $H_3$.

As previously mentioned, to test $H_1$ we set $a_1 = a_2 = 0$ in the normal equations, delete the a-equations, and solve for the m, b's, and c's. The solutions were made by sweeping out the b's in the c-equations, solving for the c's, and then for m and the b's. Actual calculation gave

m = 4463.072917, $b_1$ = -275.670486, $b_2$ = 63.149084, $b_3$ = 241.663150, $b_4$ = -29.075902, $c_1$ = 18..., $c_2$ = -119.945382, $c_3$ = 115.923281 and SS(m, b, c) = (4463.072917)(428455) + (-275.670486)(117260) + ... + (115.923281)(137539) = 1,917,082,201.60.

Therefore, according to (11),

$SSH_1$ = SS(m, a, b, c) - SS(m, b, c) = 1,918,744,217.77 - 1,917,082,201.60 = 1,662,016.17.

Similarly, in testing $H_2$ we set $b_1 = b_2 = b_3 = b_4 = 0$ and solve for m, the a's and the c's. They were

m = 4463.072917, $a_1$ = 118.646549, $a_2$ = -118.646549, $c_1$ = ..., $c_2$ = -120.219977, $c_3$ = 113.650725, and SS(m, a, c) = 1,914,514,446.97.

Therefore, according to (12)

$SSH_2$ = SS(m, a, b, c) - SS(m, a, c) = 4,229,771.40.

Also, to test $H_3$, we set $c_1 = c_2 = c_3 = 0$ and solve for m, the a's and the b's. These were

m = 4463.072917, $a_1$ = 134.104864, $a_2$ = -134.104864, $b_1$ = -284.794693, $b_2$ = 82.398056, $b_3$ = 250.723802, $b_4$ = -49.721053 and SS(m, a, b) = 1,917,880,753.97.

Therefore, according to (13)

$SSH_3$ = SS(m, a, b, c) - SS(m, a, b) = 863,471.26.


(An illustration of the calculations of $SSH_1$, $SSH_2$, and $SSH_3$ by means of (17), (18) and (19) will be available shortly in mimeographed form for interested readers.)

Finally, the analysis of variance is presented in Table II.

One remaining point should be noted. Since the effects of this design were non-orthogonal, additivity of the individual sums of squares to the total sum of squares was not present.

The least-squares analysis of the general p x q x r factorial design with unequal subclass frequencies has been demonstrated. Explanations of certain interrelationships in the analysis

(20)
m:    96m + 48a_1 + 48a_2 + 28b_1 + 20b_2 + 29b_3 + 19b_4 + 32c_1 + 34c_2 + 30c_3 = 428455
a_1:  48m + 48a_1 + 15b_1 + 8b_2 + 14b_3 + 11b_4 + 15c_1 + 17c_2 + 16c_3 = 220015
a_2:  48m + 48a_2 + 13b_1 + 12b_2 + 15b_3 + 8b_4 + 17c_1 + 17c_2 + 14c_3 = 208440
b_1:  28m + 15a_1 + 13a_2 + 28b_1 + 9c_1 + 10c_2 + 9c_3 = 117260
b_2:  20m + 8a_1 + 12a_2 + 20b_2 + 6c_1 + 8c_2 + 6c_3 = 90373
b_3:  29m + 14a_1 + 15a_2 + 29b_3 + 9c_1 + 10c_2 + 10c_3 = 136566
b_4:  19m + 11a_1 + 8a_2 + 19b_4 + 8c_1 + 6c_2 + 5c_3 = 84256
c_1:  32m + 15a_1 + 17a_2 + 9b_1 + 6b_2 + 9b_3 + 8b_4 + 32c_1 = 143259
c_2:  34m + 17a_1 + 17a_2 + 10b_1 + 8b_2 + 10b_3 + 6b_4 + 34c_2 = 147657
c_3:  30m + 16a_1 + 14a_2 + 9b_1 + 6b_2 + 10b_3 + 5b_4 + 30c_3 = 137539

and restrictions from (6) are

48a_1 + 48a_2 = 0
28b_1 + 20b_2 + 29b_3 + 19b_4 = 0
32c_1 + 34c_2 + 30c_3 = 0


have been made. This method of analysis has been applied to actual data. The amount of labor connected with an analysis of this sort is greatly reduced by using a systematic computational layout, rather than by depending on algebraic formulas.

REFERENCES

1. Anderson, R. L., and Bancroft, T. A. Statistical Theory in Research (New York: McGraw-Hill Book Co., 1952), pp. 278-284.

2. Collier, Raymond O., Jr. "Some Applications of the Method of Pivotal Condensation in Statistical Analysis," Journal of Experimental Education, XXI (1953), pp. 233-241.

3. Kempthorne, Oscar. The Design and Analysis of Experiments (New York: John Wiley and Sons, 1952), pp. 79-88.


ON THE PROBLEM OF SAMPLE SIZE FOR 
MULTIVARIATE SIMPLE 
RANDOM SAMPLING* 


WILLIAM J. MOONAN 
University of Minnesota 


0. Introduction 


THE RESEARCHER who is attempting to set up a sampling design is faced with many problems. Not the least of these is the problem of deciding which of the variates shall be used to determine the appropriate sample size. This question is usually resolved either by

a. determining the sample size from the variate which is considered most "important," or

b. determining the sample size from each of several of the more important variates and then choosing the sample size from among these numbers. The maximum number is the one which is commonly chosen.

There are certain disadvantages associated with either of the above procedures. First of all, it is often difficult to decide which variate is most "important" for purposes of the survey. In many surveys information on several variates is sought with possibly equal importance attached to each. For instance, a school survey may have as its objective the collection of information which relates to age, health, socio-economic status and learning ability. If this fact is recognized and several sample sizes are determined we may find they are very discrepant, depending, in part, of course, on the variance of the variates and the accuracy desired for estimating them. If the largest sample size is chosen, it is likely that some variates will be "over-estimated" with perhaps considerable trouble and expense. Furthermore, it must be remembered that the estimates are to be determined from the characteristics of the same randomly sampled individuals and, as a consequence, these values will not be independently estimated. It is the purpose of this paper to show how to find the sample size for surveys which sample several characteristics of the same individuals.


1. The Univariate Case 


To determine sample size in surveys, there are two important circumstances to keep in mind. The first concerns the "accuracy" that the estimates must possess and the second is the probability that the accuracy will be obtained. By "accuracy" we mean the distance on the variate scale between the estimated value of a parameter and the parameter itself. Great accuracy then signifies a small distance whereas small accuracy refers to a large distance. It would be of little value to set up a sampling scheme which intends to provide great accuracy and then proceed with the execution of this scheme if there was little chance that such accuracy would be achieved. Alternatively, nothing is to be gained by sampling with high degree of assurance that very small accuracy will be obtained. It is possible to bring both accuracy and confidence into play by determining sample size for single variates by considering the square of Student's t. It is known that t-square is distributed as Snedecor's F with 1 and n - 1 degrees of freedom, i.e.,

(1) $F(1, n-1; \alpha) = (\bar{x}^1 - \theta^1)\, S^{11}_{\bar{x}}\, (\bar{x}^1 - \theta^1)$

where $\alpha$ is the confidence coefficient, $\bar{x}^1$ is the sample mean of the variate $x^1$ (see footnote 1) based on a sample of size n, $\theta^1$ is the population mean and $S^{11}_{\bar{x}}$ is the inverse of the variance of $\bar{x}^1$; therefore, $S^{11}_{\bar{x}} = nS^{11}$ where $S^{11}$ is the inverse of the sample variance $S_{11}$.

The value of $\alpha$ is used to reflect the confidence we desire to be associated with the accuracy, $d^1 = |\bar{x}^1 - \theta^1|$. Since interest is attached to the value of n, (1) is rewritten as

(2) $n = F(1, n-1; \alpha) / d^1 S^{11} d^1$.

* This paper was suggested by some survey work at the Bureau of Institutional Research of the University of Minnesota.

1. Since we shall distinguish different variates by the use of superscripts, powers of numbers will not be used in this article.


There are some difficulties connected with the solution for n. [...] that we arbitrarily specify, but $S_{11}$ must be evaluated from a priori knowledge of the variate, a pilot study or from some method such as divid[...]

(3) n = 3.84 / 2(.01)2 = 96.

Next we find F(1, 96; .95) = 3.95 and substitute this value for 3.84 in (3); then n = 99. This [...]ted until stability [...]

For this problem: $n_0$ = 83, [...] of the students and a table [...] it is a simple matter to obtain the random 83 Otis scores and to evaluate their mean. This was actually done and $\bar{x}^1$ = 42.6 whereas, for comparison purposes, $\theta^1$ = 49.9. Therefore the desired accuracy was actually achieved as expected.
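The successive approximation for the univariate case can be sketched as below. The function name `sample_size` is illustrative. The F values must be looked up in tables and supplied by the user; the two used here (3.84 for infinite degrees of freedom, then 3.95 for 96 degrees of freedom) are the ones quoted in the text, together with an accuracy of 2 and an inverse variance of .01. Rounding before taking the ceiling is a guard against floating-point noise, not part of the method.

```python
import math


def sample_size(f_value, inverse_variance, accuracy):
    """Formula (2): n = F / (d * S^11 * d), rounded up to a whole case."""
    n = f_value / (accuracy * inverse_variance * accuracy)
    return math.ceil(round(n, 9))


first_n = sample_size(3.84, 0.01, 2)    # F(1, infinity; .95) as a starting value
second_n = sample_size(3.95, 0.01, 2)   # F(1, 96; .95) as quoted in the text
print(first_n, second_n)
```

In practice the step would be repeated, each time looking up F with the degrees of freedom implied by the previous n, until n no longer changes.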


2. The Multivariate Case


The unfamiliar notation adopted in section 2
was used in order to provide a smooth transition
from the univariate to the multivariate
cases. The generalization of Student's t was
made by Hotelling in 1931.² This statistic is
called T, and T-square, i.e., $d_i S_{\bar{x}}^{ij} d_j$, is
proportionately distributed as a variance ratio distribution.


2. Harold Hotelling, "The Generalization of Student's Ratio," Annals of Mathematical Statistics, II (1931), pp. 360-378.


In fact,


(4)  $F(p, n-p; \alpha) = (n-p)\, d_i S_{\bar{x}}^{ij} d_j \,/\, p(n-1),$


where p is the number of variates; n is the sample
size; $\alpha$ is the confidence coefficient;
$d_i = (\bar{x}^1 - \theta^1, \ldots, \bar{x}^p - \theta^p)$ and $d_j$ is the transpose of this matrix;
$S_{\bar{x}}^{ij}$ is the inverse matrix of the variance-covariance
(varcovar) matrix of the $\bar{x}^i$. Consequently,
$S_{\bar{x}}^{ij} = nS^{ij}$ where $S^{ij}$ is the inverse of
the varcovar, $S_{ij}$, of the variates.
It is possible to solve for n in (4); n is the
largest root of the following quadratic equation
in n:


(5)  $n \cdot n\,[d_i S^{ij} d_j] - np\,[d_i S^{ij} d_j + F(p, n-p; \alpha)] + pF(p, n-p; \alpha) = 0.$


If we are willing to replace $nS^{ij}$ by $(n-1)S^{ij}$ in
(4), then the value of n is


(6)  $n = p\,[1 + F(p, n-p; \alpha) \,/\, d_i S^{ij} d_j].$
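
A brief numerical comparison may help show how little is lost by the simplification leading to (6). The sketch below solves the quadratic (5) for its largest root and sets it beside the value from (6). The inputs (p = 2, F = 2.99 and a quadratic form of .052) are borrowed from the bivariate illustration given later in the paper; the function names are ours.

```python
import math

def exact_n(p, f_value, q):
    """Largest root of (5): n*n*q - n*p*(q + F) + p*F = 0."""
    a, b, c = q, -p * (q + f_value), p * f_value
    disc = b * b - 4 * a * c
    return (-b + math.sqrt(disc)) / (2 * a)

def approx_n(p, f_value, q):
    """Formula (6): n = p * (1 + F / q)."""
    return p * (1 + f_value / q)

# Illustrative inputs from the paper's bivariate example:
# p = 2 variates, F(2, inf; .95) = 2.99, d S^{-1} d = .052.
P, F_VALUE, Q = 2, 2.99, 0.052

n_exact = exact_n(P, F_VALUE, Q)
n_approx = approx_n(P, F_VALUE, Q)
print(round(n_exact, 2), round(n_approx, 2))
```

With these inputs the two answers differ by about one unit, which is what replacing n by n - 1 in (4) would lead one to expect.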


For practical applications the $S_{ij}$ may be
determined by methods which were mentioned
before. Also, we notice that F depends on n.
Therefore, we must successively approximate
n in a manner similar to that shown in the last
illustration. The arithmetic procedure can be
shown if we consider that we are seeking knowledge
of two characteristics of the chemistry
students. Let the two variates be the mental
ability as measured by the Otis test and chemistry
ability as measured by a standardized
chemistry test.


Suppose we are interested in obtaining the
accuracies


(7)  $d_i = (\bar{x}^1 - \theta^1,\ \bar{x}^2 - \theta^2) = (2,\ 3)$


with a confidence of .95. Furthermore,

(8)  $S_{ij} = \begin{bmatrix} S_{11} & S_{12} \\ S_{21} & S_{22} \end{bmatrix} = \begin{bmatrix} 100 & 90 \\ 90 & 200 \end{bmatrix}$  and  $S^{ij} = \begin{bmatrix} .016807 & -.007563 \\ -.007563 & .008403 \end{bmatrix}.$


We notice that the correlation between the
two variates is taken to be .64, i.e.,
$.64 = 90/\sqrt{100 \times 200}$. Since F(2, ∞; .95) = 2.99 and


(9)  $d_i S^{ij} d_j = (2\ \ 3) \begin{bmatrix} .016807 & -.007563 \\ -.007563 & .008403 \end{bmatrix} \begin{pmatrix} 2 \\ 3 \end{pmatrix} = .052,$


we find from (6) that n = 2[1 + 2.99/.052] = 117.
Substituting F(2, 117; .95) = 3.07 for F(2, ∞; .95),
we find n = 120 and $n_0 = 97$. If the sample
size had been determined on each variate individually
and independently, then $n_{01} = 83$ and
$n_{02} = 75$. Notice that the sample size for multivariate
sampling is larger than the sample
size determined for the variates separately.
This is due to two factors. The first is that the
variates are correlated and the second is due to
the fact that the variates have not been evaluated
from independent samples.
Let us assume that the variates are independent,
i.e., their covariance is zero. Then,
(10)  $S_{ij} = \begin{bmatrix} 100 & 0 \\ 0 & 200 \end{bmatrix}$;  $S^{ij} = \begin{bmatrix} .010 & .000 \\ .000 & .005 \end{bmatrix}$


and $d_i S^{ij} d_j = .085$, so that n = 76 after the iterations
and $n_0 = 66$.

Now we have the two independent sample
sizes 83 and 75, the joint and correlated
value of 97, and the joint and uncorrelated value
of 66. These values may be interpreted in
this manner. If we want to estimate the two variates
with certain accuracies and confidence
from the same sample of individuals, we must
take into account the covariance of the variates
as well as the fact that we do not have two independent
samples. The equations tell us that the
price paid for these circumstances is a sample
of 97. If we are willing to take two different
samples, we can achieve the same desired
accuracies and confidence with a sample size
on one variate of 83 and a sample size on the
independent other sample of another 75 individuals on
the other variate. The advantage of using one
sample is that a saving of 75 + 83 - 97 = 61 individuals
is made. Had the variates actually been
uncorrelated, we would have saved 75 + 83 - 66
= 92 individuals and still achieved the same objectives.
It might seem surprising that a smaller
sample is required for estimating both variates
simultaneously when they are uncorrelated
than is required for estimating them separately.
We assumed, in one case, that the variates
were bivariate normal, whereas
in the other case we were concerned with two
univariate distributions.
One might be tempted to use 83 individuals
and evaluate both sample means from these students.
However, if the correlation is .64, then
we cannot be 95% sure of achieving the simultaneous
accuracies. If no correlation exists,
then we are more than 95% sure of achieving
the desired accuracies. Thus our sampling
scheme is costing more than necessary for the
original specifications.
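
The arithmetic of the bivariate illustration can be checked with a short sketch. It inverts the two varcovar matrices of (8) and (10), evaluates the quadratic form for d = (2, 3), and applies (6) with the two F values used in the text (2.99, and 3.07 as the table entry is read here). The function names are ours, and the reduction from n to n0 is not reproduced because the step that produces it is not recoverable from this part of the text.

```python
import math

def inverse_2x2(s):
    """Inverse of a 2 x 2 matrix given as nested lists."""
    (a, b), (c, d) = s
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def quadratic_form(d, m):
    """d M d' for a row vector d and a square matrix M."""
    k = len(d)
    return sum(d[i] * m[i][j] * d[j] for i in range(k) for j in range(k))

def approx_n(p, f_value, q):
    """Formula (6): n = p * (1 + F / q)."""
    return p * (1 + f_value / q)

d = (2, 3)
correlated = [[100, 90], [90, 200]]    # varcovar (8)
independent = [[100, 0], [0, 200]]     # varcovar (10)

q_corr = quadratic_form(d, inverse_2x2(correlated))
q_ind = quadratic_form(d, inverse_2x2(independent))

# First pass with F(2, inf; .95) = 2.99, second with F(2, 117; .95) = 3.07.
n_start = math.ceil(approx_n(2, 2.99, q_corr))
n_next = math.ceil(approx_n(2, 3.07, q_corr))

print(round(q_corr, 3), round(q_ind, 3), n_start, n_next)
```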

There exists the reasonable notion that the 
sample size should be invariant to the signs of 
the covariances in the varcovar matrix. Direct 
application of the above formulae will not result 
in such an expectation when some of the covar- 
iances are negative, as well they might be in 
practical applications. To make the formulae 
work, certain adjustments must be made in the 
signs of the elements of $d_i$ or $S_{ij}$. For instance,
if $S_{12} = S_{21} = -90$ in the bivariate example, then
the new $d_i S^{ij} d_j$ is larger than the value given
and this results in a sample size of about 30. This
value is not realistic for this problem. The
trouble is stemming from the fact that we have
made the elements of $d_i$ assume their absolute
values. This makes no difference when all the
covariances are positive. If $S_{12}$ and $S_{21}$ are
positive, we should expect the terms $d^1 S^{12} d^2$
and $d^2 S^{21} d^1$ to be positive because $d^1$ and $d^2$ are
positive together or negative together on the average
in random sampling. If $S_{12}$ and $S_{21}$ are
negative, we should expect the terms $d^1 S^{12} d^2$
and $d^2 S^{21} d^1$ to be positive since $S^{12}$ and $S^{21}$ are
negative and on the average either $d^1$ or $d^2$ is
negative. Therefore, cross-product terms are
subtracted out in the expansion of the quadratic
form $d_i S^{ij} d_j$. The sample sizes can be brought
into agreement if we let $d_i = (2, -3)$ or $(-2, 3)$
while allowing $S_{12} = S_{21} = -90$. More simply we
can allow $S_{12} = S_{21} = 90$ whether or not they are
+90 or -90 and proceed with $d_i = (2, 3)$.

In general, we can get the appropriate sample 
size by allowing all the elements of $S_{ij}$ to be positive
and also letting all elements of $d_i$ be
positive. Otherwise, under certain conditions,
we can let elements of $S_{ij}$ have their real signs
and let the elements of $d_i$ take on the signs of
the elements of any row or column of $S_{ij}$. Since
the signs of $S_{11}$ and $S_{12}$ are both + and $S_{21}$ and
$S_{22}$ are both + when $S_{12} = S_{21} = 90$, we can let
$d_i = (2, 3)$ under either procedure. If we allow
$S_{12} = S_{21} = -90$, $S_{11}$ is + and $S_{12}$ is -, and $S_{21}$
is - and $S_{22}$ is +; therefore we can use either
$d_i = (2, -3)$ or $d_i = (-2, 3)$.
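
The effect of the sign rule can be seen numerically. The sketch below, using our own helper names, evaluates the quadratic form for the bivariate example three ways: with the covariance +90, with the covariance -90 and the accuracies left at their absolute values, and with -90 after the accuracies take the signs of a row of the varcovar. A zero element is given a + sign here, which is one of the arbitrary choices the text allows.

```python
def inverse_2x2(s):
    """Inverse of a 2 x 2 matrix given as nested lists."""
    (a, b), (c, d) = s
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def quadratic_form(d, m):
    """d M d' for a row vector d and a square matrix M."""
    k = len(d)
    return sum(d[i] * m[i][j] * d[j] for i in range(k) for j in range(k))

def signed_accuracies(d, s, row=0):
    """Give each |d_i| the sign of the matching element in one row of S_ij."""
    return [abs(x) if s[row][j] >= 0 else -abs(x) for j, x in enumerate(d)]

positive = [[100, 90], [90, 200]]
negative = [[100, -90], [-90, 200]]
d = [2, 3]

q_pos = quadratic_form(d, inverse_2x2(positive))
q_naive = quadratic_form(d, inverse_2x2(negative))    # absolute values kept
d_adj = signed_accuracies(d, negative)                # signs of the first row
q_adj = quadratic_form(d_adj, inverse_2x2(negative))

print(round(q_pos, 4), round(q_naive, 4), d_adj, round(q_adj, 4))
```

The unadjusted form is several times larger than the adjusted one, which is why the unadjusted sample size falls to about 30.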

In this preliminary work, two types of var- 
covar have been found. These have been called 
covariately consistent and covariately inconsis- 
tent. A varcovar is said to be covariately con- 
sistent if the signs of the elements in every row 
or column, when multiplied either by +1 or -1 
are identically equal to the signs of any other
row or column. Zero elements are given an
arbitrary sign. Any varcovar for which the above
conditions cannot be fulfilled is said to be covariately
inconsistent. A varcovar of order p × p
has $2^{p-1}$ covariately consistent forms. Varcovars
of orders 1 × 1 and 2 × 2 are always covariately
consistent while only 4 forms of a 3 × 3
varcovar and 8 forms of a 4 × 4 varcovar are
consistent. Consistency in a 3 × 3 varcovar
would break down if $S_{12} < 0$, $S_{13} > 0$ and $S_{32} > 0$.
Such an arrangement as this has been called covariately
inconsistent because it is difficult to
conceive of a self-contained real-variate system
which exhibits a true mutual relationship
such as this.
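
One way to read the definition is that a varcovar is covariately consistent when a vector of +1 and -1 values can be found whose pairwise products reproduce the sign of every non-zero element. The sketch below tests that condition by brute force and counts the consistent sign patterns for small orders. The function names and the brute-force search are ours and are meant only as an illustration of the definition, not as the author's procedure.

```python
from itertools import product

def sign(x):
    """Return +1, 0 or -1 according to the sign of x."""
    return (x > 0) - (x < 0)

def is_covariately_consistent(s):
    """True if some vector e of +1/-1 gives sign(S_ij) = e_i * e_j throughout.

    Zero elements are skipped, i.e. treated as having an arbitrary sign.
    """
    p = len(s)
    for tail in product((1, -1), repeat=p - 1):
        e = (1,) + tail          # fixing e_1 = +1 avoids counting e and -e twice
        if all(s[i][j] == 0 or sign(s[i][j]) == e[i] * e[j]
               for i in range(p) for j in range(p)):
            return True
    return False

def count_consistent_forms(p):
    """Count the off-diagonal sign patterns of a p x p varcovar that are consistent."""
    pairs = [(i, j) for i in range(p) for j in range(i + 1, p)]
    total = 0
    for signs in product((1, -1), repeat=len(pairs)):
        s = [[1] * p for _ in range(p)]
        for (i, j), sg in zip(pairs, signs):
            s[i][j] = s[j][i] = sg
        total += is_covariately_consistent(s)
    return total

# A hypothetical 3 x 3 pattern with a single negative covariance.
one_negative = [[1, -1, 1], [-1, 1, 1], [1, 1, 1]]
print(is_covariately_consistent(one_negative))
print([count_consistent_forms(p) for p in (1, 2, 3, 4)])
```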


For a given 3 × 3 [...] the 4 consistent forms [...]
inverses of the inconsistent [...] numerical values of the [...]
different. The rule [...] is suggested with some reluctance, but until some
theory develops it will have to suffice since apparently
inconsistent forms occur in practice.³


3. [...] inconsistent sample varcovar, see R. A. Fisher, Statistical Methods for Research Workers, in the section where he discusses the multiple regression analysis of certain [...]


3. Summary


The use of repeated and independent tests of
significance on different variates from the same
sample is known to be an unwarranted and yet
popular practice. In problems of estimation,
estimates of parameters of different variates
are often made from the same sample without
regard to the dependence of these estimates.
This paper shows how appropriate estimates of
mean values of different variates can be made
using the same sample observations with a simple
random sample. An illustration was given for
the bivariate case although the technique need not
be restricted to the estimation of the means of
two variates. The ideas contained in this paper [...]


NOTE ON DISPERSION ANALYSIS 


CHESTER HARRIS 
University of Wisconsin 


ONE OF THE functions of dispersion analysis
is to give a canonical representation of
groups in one or more statistically significant
dimensions. In developing this representation
it is also possible to provide for the calculation
of the Mahalanobis' $D^2$, or generalized
distance, for each pair of groups, and
of the linear discriminant scores associated with
each group. A number of persons have contributed
to the development of these procedures; the
review by Hodges (1) and the paper by Tiedeman
(4) give important facts about the history of this
development. Uses of dispersion analysis in
psychology have been illustrated recently in the
Rao-Slater analysis of neurotic groups (3) and
by Webster (5). The purpose of this note is to
comment on the matrices used in dispersion analysis
and to suggest a calculation routine that
some may prefer. Since this routine
may be readily inferred from Rao's illustration
(2; 367-70), it constitutes no new contribution to
the theory.

Given data on n variables for N persons
categorized into s mutually exclusive groups,
the matrices B and W may be formed. B is the
between groups product-sum matrix, with s - 1
degrees of freedom, and W is the within groups
product-sum matrix with N - s df. It is this
df that appears in the equations below.
Let G designate the n-by-s matrix of group
means. What is desired is to identify one or
more rows of a matrix Y, where YG gives the
canonical representation of the groups. In solving
the problem it is conventional to assume W
non-singular, so that the determinantal equation:


$|B - \lambda W| = 0$  (1)

may be rewritten as:

$|BW^{-1} - \lambda I| = 0$  (2)

for the purpose of calculating the roots. The desired
rows of Y are determined by solving, for
each significant root, the homogeneous equation:


$y(BW^{-1} - \lambda I) = 0$  (3)


and standardizing. The standardization process
will be made explicit later. A test of significance
is used to determine the number of significant
roots, and ordinarily only the rows of Y
corresponding to the significant roots are calculated.
The Rao-Slater and Webster papers may
be referred to for numerical examples; the Rao-
Slater analysis also appears in Rao (2).

For the purpose of this development, let us 
make some simplifying assumptions, namely, 
that B and W both are non-singular and that the 
complete Y is to be determined. Since every 
row of Y satisfies (3), it is necessarily true that: 


$YB = D_\lambda YW,$  (4)


where $D_\lambda$ designates the diagonal matrix of roots
of (1) and is, by our assumption, non-singular.
Post-multiplying (4) by $Y^T$, the conventional
transpose of Y, gives:


$YBY^T = D_\lambda YWY^T.$  (5)


Now B is symmetric, as is W, and consequently
both sides of (5) must be symmetric. This implies
that $D_\lambda$ and $YWY^T$, both of which are non-singular,
must be commutative. Within the non-singular
multiplicative group, scalars and diagonals,
and only scalars and diagonals, are commutative
with any diagonal. It therefore is both
necessary and sufficient, for (5) to hold, that
$YWY^T$ be either a diagonal or a scalar matrix.
The standardization process provides the added
requirement that the elements of the principal
diagonal of $(1/df)YWY^T$ each equal unity. Therefore
we may write:


$(1/df)\,YWY^T = I$  (6)

and show $Y^TY$ to be the inverse of $(1/df)W$, since:

$(1/df)\,WY^TY = I = (1/df)\,Y^TYW.$  (7)


The problem of solving for Y is thus formulated
as a problem of choosing the proper factors of
the matrix $(df)W^{-1}$.


Rao's transformation (2) on the elements of
$(1/df)W$ provides a matrix C, such that:


$(1/df)\,CWC^T = I$  (8)

and

$C^TC = (df)W^{-1} = Y^TY.$  (9)

Therefore we may equate:

$C^TQ^T = Y^T$  (10)

with Q necessarily orthogonal, since:

$C^TQ^TQC = Y^TY = C^TC.$  (11)


We have shown, then, that $Y^T$ is merely an orthogonal
rotation of $C^T$, and consequently, CG
is some orthogonal rotation of the canonical representation,
YG, where as before G is the matrix
of means of the groups. Our first conclusion
therefore is that for a description of the s
groups in n space, CG and YG are equally satisfactory.
It is also true that the Mahalanobis'
$D^2$ values may be calculated directly from CG
by summing the squares of the differences for
each pair of columns of CG.
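
The statement that $D^2$ can be read directly from CG may be checked on a small invented example. In the sketch below, C is taken to be the inverse of the lower-triangular Cholesky factor of (1/df)W. That choice is our assumption, standing in for Rao's transformation; any C satisfying (8) would serve. The matrices W and G and the value of df are hypothetical.

```python
import math

def matmul(a, b):
    """Product of two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def cholesky(a):
    """Lower-triangular L with L L^T = a, for a symmetric positive definite a."""
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(l[i][k] * l[j][k] for k in range(j))
            if i == j:
                l[i][j] = math.sqrt(a[i][i] - s)
            else:
                l[i][j] = (a[i][j] - s) / l[j][j]
    return l

def invert_lower(l):
    """Inverse of a lower-triangular matrix by forward substitution."""
    n = len(l)
    inv = [[0.0] * n for _ in range(n)]
    for i in range(n):
        inv[i][i] = 1.0 / l[i][i]
        for j in range(i):
            s = sum(l[i][k] * inv[k][j] for k in range(j, i))
            inv[i][j] = -s / l[i][i]
    return inv

def d_squared(cg, a, b):
    """Sum of squared differences between columns a and b of CG."""
    return sum((row[a] - row[b]) ** 2 for row in cg)

W = [[40.0, 10.0], [10.0, 20.0]]     # hypothetical within-groups matrix
DF = 10                               # hypothetical N - s
G = [[1.0, 3.0, 2.0],                 # hypothetical means: rows are variables,
     [2.0, 1.0, 4.0]]                 # columns are groups

V = [[w / DF for w in row] for row in W]     # (1/df) W
C = invert_lower(cholesky(V))                # then C V C^T = I, as in (8)
CG = matmul(C, G)

print(round(d_squared(CG, 0, 1), 4))
```

The value for groups 1 and 2 agrees with the direct form $(g_1 - g_2)^T [(1/df)W]^{-1} (g_1 - g_2)$, which for these figures is 16/7.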
What has been done is to develop an arbitrary
set of factors of $(df)W^{-1}$ by using Rao's transformation [...]
an alternative to Wilks' Λ criterion (Rao, 2; 372).
Solve sequentially for the characteristic vectors
and roots of $CBC^T$. As each [...]
of $CBC^T$ is a normalized characteristic vector [...]


It can be shown that the roots of $(df)BW^{-1}$ are
given by this procedure. Using (9),


$(df)BW^{-1} = BC^TC.$  (12)

Since Y satisfies (3) and is non-singular,

$YBC^TCY^{-1} = D_k$  (13)

where $D_k$ is the diagonal matrix of the roots of (1)
multiplied by the scalar (df). Then:

$BC^TCY^{-1} = Y^{-1}D_k$  (14)

and

$CBC^T(CY^{-1}) = (CY^{-1})D_k$  (15)

which shows $(CY^{-1})$ to be columns of characteristic
vectors of $CBC^T$ whose roots are [...].
Since $CBC^T$ is symmetric, $(CY^{-1})$ may be [...]
as an orthogonal matrix, say, $Q^T$, [...]:

$C^TQ^T = C^TCY^{-1} = Y^TYY^{-1} = Y^T,$  (16)

as required.


The restriction that B be taken as non-singular
may now be removed. With W non-singular,
both C and Y are non-singular, and the complete
Y exists in the sense of (9). We, therefore, permit
$D_k$ in (13) to be singular, that is, to have
one or more zero roots as entries in the principal
diagonal. Equation (15) is still valid however,
since every symmetric matrix whether
singular or non-singular is orthogonally similar
to a diagonal matrix of its latent roots. The evident
modification is that those columns of $(CY^{-1})$
corresponding to zero roots would not be of interest,
and consequently not be calculated. As
equation (16) indicates, this does not interfere
with determining the desired rows of Y.


To remove the restriction that W be non-singular
is obviously much more difficult and probably
not very useful, since one ordinarily would
want the within groups product-sum matrix to be
of rank equal to the number of variables. However,
it seems likely that this restriction might
be removed. Rao's transformation can be used
with singular matrices, yielding a singular C;
then the problem becomes one of solving for the
roots and characteristic vectors of the singular
$CBC^T$ as before. The algebra would have to
be modified to establish this, and it will not be
attempted here.


Summary


The procedure suggested by this analysis
might take these steps:


a) Compute B, W, and G from the given data.
b) Use Rao's transformation to form C, such
that $(1/df)CWC^T = I$.
c) Compute CG. This gives an orthogonal rotation
of the complete canonical representation [...]
in order to assist in viewing the configuration
of means; Thurstone's extended-vector [...]
d) Compute the Mahalanobis' $D^2$ measures, if
desired [...] significance for [...] available [...]
e) Compute $CBC^T$ [...] significance as [...] criterion.
f) Extract characteristic vectors and roots of
the symmetric matrix $CBC^T$ so long as the
residual variation is significant. Use these
characteristic vectors with unit variance [...]
C to Y and consequently CG [...] canonical representation.
g) If the linear discriminant scores for the
groups are desired, the coefficients are given
by $C^TCG$, since $C^TC = (df)W^{-1}$.


The main advantage of this procedure is that it
[...] the configuration of group means at a relatively
early stage and with relatively little calculation;
this may be an important consideration
in [...] explorations of new data.
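
The later steps of the routine can be illustrated for the special case of two variables, where the characteristic vectors of the symmetric matrix $CBC^T$ are available in closed form. The sketch below is ours and rests on stated assumptions: W, B and df are invented, C is again the inverse Cholesky factor of (1/df)W rather than Rao's own transformation, and the rotation angle formula applies only to 2 × 2 symmetric matrices. It forms Y = QC and checks relation (4) together with the standardization (6).

```python
import math

def matmul(a, b):
    """Product of two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def triangular_c(w, df):
    """A 2 x 2 matrix C with (1/df) C W C^T = I (inverse Cholesky factor)."""
    a, b, d = w[0][0] / df, w[0][1] / df, w[1][1] / df
    l11 = math.sqrt(a)
    l21 = b / l11
    l22 = math.sqrt(d - l21 * l21)
    return [[1 / l11, 0.0], [-l21 / (l11 * l22), 1 / l22]]

def symmetric_eigen_2x2(m):
    """Roots and an orthogonal Q whose rows are characteristic vectors of m."""
    theta = 0.5 * math.atan2(2 * m[0][1], m[0][0] - m[1][1])
    c, s = math.cos(theta), math.sin(theta)
    q = [[c, s], [-s, c]]
    diag = matmul(matmul(q, m), transpose(q))
    return [diag[0][0], diag[1][1]], q

W = [[40.0, 10.0], [10.0, 20.0]]     # hypothetical within-groups matrix
B = [[30.0, 12.0], [12.0, 18.0]]     # hypothetical between-groups matrix
DF = 10                               # hypothetical N - s

C = triangular_c(W, DF)
roots_k, Q = symmetric_eigen_2x2(matmul(matmul(C, B), transpose(C)))
Y = matmul(Q, C)                      # from (10): Y^T = C^T Q^T
lambdas = [r / DF for r in roots_k]   # roots of (1) are those of D_k divided by df

YB = matmul(Y, B)
YW = matmul(Y, W)
print([round(v, 4) for v in lambdas])
```

Each row of YB should equal the corresponding row of YW multiplied by its root, which is relation (4) written row by row.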


REFERENCES


1. Hodges, Joseph L., Jr. Discriminatory Analysis:
I. Survey of Discriminatory Analysis
(Randolph Field, Texas: USAF School of Aviation
Medicine, October 1950).

2. Rao, C. Radhakrishna. Advanced Statistical
Methods in Biometric Research (New York:
John Wiley and Sons, 1952).

3. Rao, C. R., and Patrick Slater. "Multivariate
Analysis Applied to Differences Between
Neurotic Groups," British Journal of Psychology,
Statistics Section, II (1949), pp.
17-29.

4. Tiedeman, David V. "The Utility of the Discriminant
Function in Psychological and
Guidance Investigations," Harvard Educational
Review, XXI (Spring, 1951), pp. 71-80.

5. Webster, Harold. "Rao's Multiple Discriminant
Technique Applied to Three TAT Variables,"
Journal of Abnormal and Social
Psychology, XLVI (July 1952), pp. 641-648.


CONSTANCY OF RORSCHACH COLOR RE- 
SPONSES UNDER EDUCATIONAL 
CONDITIONING 


JANET E. BLECHNER 
Berkeley City Schools 
Berkeley, California 


I


IN A PREVIOUS article describing an
experiment on the constancy of the Rorschach
movement responses, the writer outlined comments
concerning the role of the Rorschach in
clinical usage. The point was stressed that
the personality descriptions resulting
from any administration of the test were
treated as representing basic personality structure
[...] upon which daily experimental
phenomena made somewhat narrow ripples on a
[...] stream. Because of the ever-increasing
[...] of the [...] device in this manner it
[...] ever necessary to attempt
to determine the intrinsic validity of the scoring
factors upon which these personality descriptions
are based.

II The Problem


The experiment herein described constituted
an [...] the original test of constancy of
certain scores. The writer attempted to
determine whether the color responses (FC +
CF + C) would increase with educational conditions
designed to influence such an increase.


III Method and Procedure


Two classes of students in the beginning educational
psychology course at the University of
California at Berkeley in the fall of 1951 were
designated as subjects for the experiment. One
class served as the experimental group, another
as the control group. Case numbers were as
follows:


Experimental group    83
Control group         96


Both groups took the group version of the
Rorschach twice, with an interval of about a
month ensuing between the two administrations.
The test records utilized for the experimental
group were those only of students whose attendance
at the conditioning session was attested
by an attendance record. This record consisted
of a response blank for the Ishihara test of
color-blindness, with the record serving the
purpose of eliminating from the study those
who were color-blind, as well as those who were
not present.

Administration of the test and method of recording
responses were performed according to
the technique outlined by Harrower-Erickson.
The only variation of the method occurred in the
administration of the repeat test, when the subjects
were instructed neither to attempt to remember
specifically what their earlier responses
had been nor to try deliberately to give responses
different from the previous ones. They were encouraged
to try to respond to the slides as if they
had never seen them before, recording what they
now saw.

Scoring, by the Klopfer method, was performed
by the examiner for the original test and for
the second by a clinical psychologist having no
interest in nor connection with the study whatsoever.
This scorer also made a spot-check perusal
of the examiner's scoring, the aim being to
secure as high a degree of objectivity in scoring
as possible.

For the conditioning procedure two different 
devices were utilized. The first consisted of 
lecture material in which the instructor discussed 
visual color-perception theories and the signifi- 
cance of color associations in the psychology of 
perception. He mentioned also, for example, the 
psychological origins of acceptance of certain 
colors as cool and others as warm.

In addition to the lecture material, the instructor
presented a set of slides which had been constructed
by the writer. These were composed of
brightly and variously colored bits of odd-shaped
gelatin papers, some of them shaped in a way that
would suggest an association with the color. A
green bit of paper, for example, might be shaped
like grass, with the objective that the student
should be led to the association, "green grass."
The slides were explained as an attempt at construction
of a new test, and the students were
urged to assist the project by listing as many of
the paired form and color associations as possible.
Actually, it was believed that this constituted practice
in responding to color, and the question was
whether or not the conditioning would result in
heightened awareness of the color on the second
Rorschach test. For the control group, of course,
there was no conditioning activity of any kind. In
both groups there was, as well, no mention of the
Rorschach test except during the test administration
itself.


TABLE I

MEANS AND STANDARD DEVIATIONS OF NUMBERS OF
RESPONSES ON TEST AND RE-TEST USING THE
GROUP RORSCHACH

(Data from Two Groups of College Students)

                          Group I     Group II
                          N = 83      N = 96
Means
   Test I                 17.25       18.58
   Test II                22.64       22.31
   Gain                    5.39        3.73
Standard Deviations
   Test I                  6.26       [...]
   Test II                19.45       [...]
   Gain                    4[...]     [...]


TABLE II

[...] RESPONSES IN THE COLOR CATEGORY UPON TEST
AND RE-TEST, EXPRESSED AS ACTUAL NUMBER
OF RESPONSES

(Data from Two Groups of College Students)

                          Group I     Group II
                          N = 83      N = 96
Means
   Test I                  1.78       [...]
   Test II                 3.73       [...]
   Gain                    1.95       [...]
Standard Deviations
   Test I                  1.72       [...]
   Test II                 2.52       [...]
Critical Ratio             7.16       [...]
IV Results

In both groups, there was an expected increase
in the total number of responses to the
second test, but the increase tended to consist
largely of form-dominated responses to detail,
a phenomenon noted also by Harrower-Erickson
in her original standardization of the
group method of administration.

The number of color responses increased in
the second test by a mean of 1.95 responses for
the experimental group and 1.11 for the control
group. Despite any apparently greater increase in
the conditioned experimental group, however,
critical ratios of the difference in the
means point to the significance of the increase
for both groups. This evidence suggests the conclusion
that an increase in color responses is to
be found in repeat administration of the test,
but the evidence does not point conclusively to
the accomplishment of this result by virtue of
the conditioning. Were this to be assumed, it
would have been necessary to discover a significant
increase in the conditioned group but none
in the control group. It may be theorized, then,
that upon repetition of the test there is a relaxation
by the subjects of their intellectual control
and greater release of the affectivity which the
color responses are presumed to indicate, or that
familiarity with the test leads to greater response
to the color. It must be assumed, therefore,
that the color factor on the Rorschach does not
refer to a stable personality element but to one
which can be affected in as yet unknown ways.


V Summary and Conclusions 


Two classes of beginning educational psychol- 
ogy students at the University of California were 
utilized as experimental and control groups for 
an experiment to test the influence of education- 
al conditioning on the color responses to the Ror- 
schach test. Both groups took the Rorschach 
twice at an interval of about one month. Prior 
to the repeat test in the experimental group the 
instructor presented lecture material, the Ishi- 
hara test of color-blindness, and examiner- 
constructed slides designed to increase the sub- 
jects’ experience at formulation of color-domin- 
ated concepts. Results showed a significant in- 
crease in number of color responses in both 
groups, but the increase could not be ascribed 
to the conditioning, since it appeared in both 
groups. Intrinsic validity of the color factor on 
the Rorschach, therefore, is open to question. 


"PSYCHOLOGICAL" CORRECTION 
FOR CHANCE 


JULIAN C. STANLEY * 
University of Wisconsin 


CORRECTING TEST scores for "guessing"
amounts to correcting them for differences
in number of unanswered items among the
testees. If $\sigma_O$, the standard deviation of the
number of items omitted, is zero, then each
person's z-score is unaffected by the usual correction
for chance success. Therefore, if each
testee leaves blank the same number of items
as every other testee, the correlation $r_{RS}$ between
"rights" scores (R) and corrected
scores

$S = R - \frac{W}{c-1}$

will be 1.00. Here W is the individual's
"wrongs" score and c the number of choices
(options) each item has.

That the correction for chance is not needed
when all testees answer all items was shown in
1924 by Holzinger (5) and later re-stated in various
places (e.g., 2, p. 271; 1; 3, p. 70; 6, p. [...]).
Perhaps the most adequate brief discussion
is by Gulliksen (4, pp. 245-251), who states
that:

   "... there is no reason for considering
   any of these [corrections for chance]
   formulas if, for most of the people, R
   + W is essentially equal to the total
   number of items in the test. Such formulas
   need to be used if, and only if, the
   number of unmarked items ... is fairly
   large for some persons, and fairly
   small for others." (p. 248)


In the following paragraphs the writer presents
proof that when $\sigma_O = 0$, each testee's
z-score is unchanged by the correction for
chance. In other words, for this special situation
$z_{(R - \frac{W}{c-1})} = z_R$. He also suggests
that even when a correction for chance is useless
from the statistical standpoint, it may
nevertheless be warranted by a certain attitude-[...]
effect on college students, once the logic
of the correction formula is understood by
them.

The formula $S = R - \frac{W}{c-1}$ may be written

$S = \left(\frac{c}{c-1}\right)R - \frac{n - O}{c-1},$

where n is the total
number of items on the test (n = R + W + O for
each testee). If O is the same for each examinee,
S is a linear function of R, so $r_{RS} = 1$.
Thus when $\sigma_O$ is approximately zero, it appears
to be a waste of time to correct raw scores for
chance, since the correction will not alter any
standard scores.
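
A small numerical check makes the point concrete. In the sketch below every testee omits the same number of items, and the z-scores of the rights scores are compared with the z-scores of the corrected scores. The test length, number of options, number of omits and the rights scores are all invented for the illustration, and the function names are ours.

```python
from statistics import mean, pstdev

def corrected(r, w, c):
    """Usual correction for chance: S = R - W / (c - 1)."""
    return r - w / (c - 1)

def z_scores(xs):
    """Standard scores using the population standard deviation."""
    m, sd = mean(xs), pstdev(xs)
    return [(x - m) / sd for x in xs]

N_ITEMS, C, OMITS = 50, 4, 5            # hypothetical test; same omits for all
rights = [40, 33, 27, 21, 36]           # hypothetical rights scores
wrongs = [N_ITEMS - OMITS - r for r in rights]
scores = [corrected(r, w, C) for r, w in zip(rights, wrongs)]

z_r, z_s = z_scores(rights), z_scores(scores)
print([round(z, 4) for z in z_r])
print([round(z, 4) for z in z_s])
```

The two lists agree, since S is here a linear function of R with positive slope c/(c - 1).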

This can be proved rigorously by starting
with the basic formula for a z-score, making
the correction for chance in the general case
where $\sigma_O \neq 0$, simplifying the ensuing formula,
and then showing that when O does not vary from
testee to testee the formula reduces to the original
z uncorrected for chance. Begin with the
usual formula,

$z = \frac{R - \bar{R}}{\sigma_R},$

where $\bar{R}$ is the mean "rights" score. Then for
any one of N testees,


$z' = \frac{\left(R - \frac{W}{c-1}\right) - \frac{1}{N}\sum_{1}^{N}\left(R - \frac{W}{c-1}\right)}{\sigma_{\left(R - \frac{W}{c-1}\right)}}$  (1)

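It can be shown rather easily that the mean
corrected for chance,

$\frac{1}{N}\sum_{1}^{N}\left(R - \frac{W}{c-1}\right)$, equals $\left(\frac{c}{c-1}\right)\bar{R} - \frac{1}{c-1}(n - \bar{O})$,

or $\frac{c\bar{R} + \bar{O} - n}{c-1}$. Also, the standard deviation
corrected for chance, $\sigma_{\left(R - \frac{W}{c-1}\right)}$, equals

$\frac{1}{c-1}\sqrt{c^2\sigma_R^2 + \sigma_O^2 + 2c\,r_{RO}\,\sigma_R\sigma_O}$. Substituting
these values for the corrected mean and corrected
standard deviation in formula (1), we
obtain after further simplifications


$z' = \frac{c(R - \bar{R}) + (O - \bar{O})}{\sqrt{c^2\sigma_R^2 + \sigma_O^2 + 2c\,r_{RO}\,\sigma_R\sigma_O}}$  (2)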


* The author is indebted to William J. McIlrath for bibliographic assistance.


If all O's are zero or, more generally, if
$\sigma_O = 0$, formula (2) becomes


$z' = \frac{c(R - \bar{R})}{\sqrt{c^2\sigma_R^2}} = \frac{R - \bar{R}}{\sigma_R} = z.$
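
Formula (2) can also be checked numerically in the general case where omits vary. The sketch below computes z' two ways for invented data: directly, by correcting each score and standardizing, and from formula (2). The covariance term stands in for $r_{RO}\sigma_R\sigma_O$, population standard deviations are used throughout, and the data and function names are ours.

```python
from math import sqrt
from statistics import mean, pstdev

def z_direct(rights, omits, n, c):
    """Correct each score with S = R - W/(c - 1), then standardize."""
    s = [r - (n - r - o) / (c - 1) for r, o in zip(rights, omits)]
    m, sd = mean(s), pstdev(s)
    return [(x - m) / sd for x in s]

def z_formula2(rights, omits, c):
    """Formula (2), with cov(R, O) written for r_RO * sigma_R * sigma_O."""
    rb, ob = mean(rights), mean(omits)
    sr, so = pstdev(rights), pstdev(omits)
    cov = mean([(r - rb) * (o - ob) for r, o in zip(rights, omits)])
    denom = sqrt(c * c * sr * sr + so * so + 2 * c * cov)
    return [(c * (r - rb) + (o - ob)) / denom for r, o in zip(rights, omits)]

N_ITEMS, C = 50, 4                      # hypothetical test
rights = [40, 33, 27, 21, 36]           # hypothetical rights scores
omits = [0, 5, 10, 2, 4]                # hypothetical, varying omits

za = z_direct(rights, omits, N_ITEMS, C)
zb = z_formula2(rights, omits, C)
print([round(z, 4) for z in za])
print([round(z, 4) for z in zb])
```

Setting every entry of `omits` to the same value makes the covariance and $\sigma_O$ vanish, and the result reduces to the uncorrected z as stated above.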


While there is no psychometric necessity for
the chance correction when the number of omits
varies little from testee to testee, there may
be an excellent psychological reason. This is
best illustrated by the true-false or two-option
multiple choice test, where c = 2. Those examinees
who are totally ignorant concerning the
test material and who make sheer guesses on
all n items will tend to "earn" scores of n/2
unless a correction for guessing is utilized. The
average rights score of these uninformed testees
will be about 50% of the total possible score,
though their actual knowledge is, by definition,
0%. When omits are negligible, correcting all
scores for chance raises their standard deviation
from

$\sigma_R$ to $\left(\frac{c}{c-1}\right)\sigma_R$.

Thus the increase ranges from 100% for two-option
items to 25% for five options.
Also, the mean drops from $\bar{R}$ to $(c\bar{R} - n)/(c - 1)$
[...] even when from a
strict measurement standpoint it is useless. If
there are very few omits, a relatively simple
correction formula is

$S = n - \left(\frac{c}{c-1}\right)(n - R).$

If R = n, then S = n. If R = n - 1, then $S = n - \frac{c}{c-1}$.
If R = n - 2, then

$S = \left(n - \frac{c}{c-1}\right) - \frac{c}{c-1}.$

This constant subtractive factor of c/(c - 1) greatly
simplifies the process of finding corrected
scores, especially if an electric calculating machine
is available.
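
The equivalence of the simple formula and the usual correction, when no items are omitted, can be confirmed exactly with rational arithmetic. The sketch below is ours; the test length and number of options are invented, and W is taken to be n - R because omits are assumed to be zero.

```python
from fractions import Fraction

def corrected_score(rights, wrongs, c):
    """Usual correction for chance: S = R - W / (c - 1)."""
    return Fraction(rights) - Fraction(wrongs, c - 1)

def simple_corrected_score(rights, n, c):
    """Simplified form for tests with no omits: S = n - (c / (c - 1)) (n - R)."""
    return n - Fraction(c, c - 1) * (n - rights)

N_ITEMS, C = 20, 5                      # hypothetical test with no omits
step = Fraction(C, C - 1)               # the constant subtractive factor

table = [(r, simple_corrected_score(r, N_ITEMS, C)) for r in range(N_ITEMS, N_ITEMS - 4, -1)]
for r, s in table:
    print(r, s)
```

Each additional wrong answer lowers the corrected score by the same amount, c/(c - 1), which is what makes the running subtraction convenient on a desk calculator.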


REFERENCES


1. Casanova, Teobaldo. "Analysis of the Effect
upon the Reliability Coefficient of
Changes in Variables Involved in the Estimation
of Test Reliability," Journal of Experimental
Education, IX (March 1941),
219-228.
2. Freeman, F. N. Mental Tests (Boston:
Houghton Mifflin Co., 1926).
3. Greene, E. B. Measurements of Human Behavior
(New York: Odyssey Press, [...]).
4. Gulliksen, Harold. Theory of Mental Tests
(New York: John Wiley and Sons, 1950).
5. Holzinger, K. J. "On Scoring Multiple Response
Tests," Journal of Educational Psychology,
XV (October 1924), 445-[...].
6. Remmers, H. H. and Gage, N. L. Educational
Measurement and Evaluation (New
York: Harper and Brothers, 1943).


ERRATA


The author, Clifford M. Christensen, wishes to correct three errors that appeared
in his article, "Multivariate Statistical Analysis of Differences Between Pre-professional
Groups of College Students," in the March 1953 issue of the Journal of
Experimental Education, pp. 221-232:
1. p. 223, formula (3) should read:

$\chi^2 = -\{n - 1/2(p + q + 1)\} \log_e V$

2. p. 229, formula (8) should read:

$L_j = l_1X_1 + l_2X_2 + l_3X_3 + l_4X_4 - 1/2 \sum l_i \bar{X}_{ij} + \log_e \pi_j$
where $l_i = \{a^{ih}\}(\bar{X}_i)$

3. p. 230, constant terms in Table X should read:

$-1/2 \sum_{i=1}^{4} l_i \bar{X}_{ij} + \log_e \pi_j$
* * * * * * * *

In the December, 1953, Journal of Experimental Education, on page 150 of the
article "A Simple Course Evaluation Scale," by Ralph Mason Dreger, the line
"*Both ratios are significant at the 1% level" should have been included under
Table 1.

Journal of Experimental Education


Volume XXII 


June, 1954 


Number 4 


SOME EFFECTS OF PROMOTION AND NON- 
PROMOTION UPON THE SOCIAL AND PER- 
SONAL ADJUSTMENT OF CHILDREN 


JOHN I. GOODLAD 
Emory University* 
Emory, Georgia 


SECTION I 


THE PROBLEM 


Purpose of the Study


THE PURPOSE of this study is to determine
whether or not differences in social and
personal adjustment exist between two groups
of promoted and nonpromoted children. An attempt
will be made to evaluate the nature of any
differences found and to explain the relationship
of such differences to the factors of promotion
and nonpromotion. This should suggest implications
for educational policy governing the classification
of pupils. Finally, it is hoped that, as
a result of the findings, specific recommendations
may be made for the grade placement and
subsequent instruction of slow-learning boys and
girls in the elementary school.


An Evaluation of Promotion and Nonpromotion

Promotion practices and achievement. Studies
into the achievement of repeaters indicate
that such children do no better than children of
like ability who are promoted. This was suggested
by Keyes nearly forty years ago when he
reported that only 21 percent of a large group of
repeaters did better after repeating a grade
while 39 percent actually did worse (24).
Of course, it is impossible to estimate how well
these same children might have done had they
been promoted. Arthur sought to answer this
question when she matched a group of repeaters
with a group of non-repeaters on the basis of
mental age and discovered that the former
learned no more than the latter over a two-year
period ([...]). She put forward the thought, however,
that failure to eliminate the causes of retention
rather than the repeating experience itself
may have been the more potent factor in determining
subsequent achievement. The cause-and-effect
relationship of a given factor can be
clarified only by holding constant other factors
likely to be influential. Klene and Branson took
cognizance of this fact when they equated children,
all of whom were to be retained in the grade,
on the basis of chronological age, mental age,
and sex (25). Half were then promoted and
half retained in the present grade. They concluded
that, on the whole, potential repeaters
profited more from promotion than did the repeaters
from nonpromotion so far as achievement
is concerned. In this connection, Cheyney
and Boyer observed that lack of readiness for
the work of a given grade is largely due to a
slow learning rate, which will not be improved
by repeating a grade section (9). Saunders
summed up an extensive survey of studies into
the effects of nonpromotion upon school achievement
as follows:


....it may be concluded that nonpro- 
motion of pupils in elementary schools 
in order to assure mastery of subject 
matter does not often accomplish its ob- 
jective. Children do not appear to learn
more by repeating a grade but experience 
less growth in subject-matter achieve- 
ment than they do when promoted. There- 
fore a practice of nonpromotion because 
a pupil does not learn sufficient subject 
matter in the course of a school year, or 
for the purpose of learning subject mat- 
ter, is not justifiable (37).


Promotion practices and homogeneous grouping. —For most teachers, to secure
a class of children closely approximating one another in all areas of
development would be the realization of a teaching Utopia. However, Keliher
(23) questions the social desirability and Elsbree the feasibility of
obtaining any such condition of general homogeneity (12). Burr points out
that when groups are made non-overlapping in achievement for one subject, or
even for a phase of a subject, they overlap greatly in other subjects or
other phases of the same subject (6). From a study of 46 schools with
varying rates of slow progress, Caswell (7) concluded that variability in
achievement is no less for schools with high rates of nonpromotion than for
schools with low rates. Whether or not homogeneous grouping be desirable or
attainable, nonpromotion does not appear to reduce the range of specific
abilities with which the teacher has to cope.


Promotion practices and habits and attitudes, 
— "Viele, arguing for the abandonment of no-fai 


of reasoning. Both Hurlock (21) and Gilchrist 
(20) found that gr 


praised for their 


Pupils, indicated 


school work (36).
Of course, these children might have felt the
same had they been promoted; Sandin


(22). Research studies conducted by McElwee (28) and by Sandin (38)
revealed a greater incidence of behavior considered troublesome to teachers
among retarded children than among regular-progress pupils. Although these
findings favor promotion over nonpromotion, further experiments with
carefully controlled situations need to be conducted.

Promotion practices and personal-social adjustment. — The area involving
personal-social adjustment as affected by promotion practices probably is
most barren of research. A study by Farley, Frey, and Garland indicated a
significant correlation between retardation and a low score on a five-point
character-rating scale but left open the question as to whether this was a
cause or an effect relationship (16). Anfinson sought to determine the
nature of this relationship by setting up controls (2). He matched 11 pairs
of junior high school pupils on the basis of school attendance,
chronological age, sex, intelligence, and socio-economic status, one member
of each pair having been promoted regularly and the other having repeated
some previous grade. His findings showed a significant advantage for
non-repeaters over repeaters in social and personal adjustment as revealed
by the Symonds-Block Student Questionnaire. As Anfinson pointed out, it
would have been preferable to have tested these irregular-progress pupils
soon after the failure occurred; in some cases several years had elapsed. In
addition, the range of measuring techniques was very limited. The results of
such questionnaires, when used without other sorts of evidence, must be
handled with considerable reservation.


Sandin used s 
Check lists, obs 


study aspects of social and personal adjustment
(36). In general, he found that nonpromoted
children tended
grades higher than their own, to be pointed out
by classma 

older pupil 
the selecti 
finding did 
Where nonp: 
cantly mor 
His finding, is 
described previous] disclosed a general 0 
look indicative of 4 a happy adjustmentamoné 
mong normal-progress te 
made no attempt to equa 
other factors likely to af- 
nal adjustment, it is «d 
Contributing influence ea 
Sandin put his finger € 


It is necessary to conduct further study to discover to what extent
children who might have been nonpromoted according to grade standards,
but who actually were promoted, show a better picture of adjustment than
those who were held back (36).




Contribution of the Present Study 


The present investigation takes over where Sandin left off. Sandin raised
the question as to whether his findings would have been any different even
if his groups of 400 nonpromoted children had been promoted regularly. While
it is impossible to know what would have happened to these particular
children, it is possible to draw appropriate conclusions from the
experiences of similar children under corresponding conditions. This study,
then, makes its contribution to existing research by comparing the social
and personal adjustment of equated groups of children subsequent to
promotion and nonpromotion. The following thesis statement gave direction to
the investigation: During the year following promotion or nonpromotion, what
differences in personal and social adjustment may be distinguished among
promoted and nonpromoted children of corresponding chronological age, mental
age, and scholastic achievement level?

Analysis of the thesis suggests two sorts of differences and three levels of
treatment for data collected. The first differences are those between the
two groups, regardless of similarities among children of both groups and of
individual differences in reverse of general trend. The other differences
are those existing among the members of any one group. This study is
concerned with both types of differences. The first level of treating data
is to reveal the existence or non-existence of either type of difference.
Such treatment constitutes adequate analysis of the thesis; however, the
value of the study is enhanced by qualitative interpretation of differences.
Consequently, this study further seeks to determine the relative advantages
or disadvantages of any differences identified and to evaluate, tentatively,
the roles of promotion and nonpromotion in producing these differences.


Method of Approach


In approaching this problem, a major difficulty was encountered in
attempting to control all the many factors in the school situation capable
of influencing the social and personal development of children. It was
impossible to match schools, classrooms, teachers, and pupils on all but the
factors of promotion and nonpromotion. Furthermore, since it was necessary
to select the children early enough in the year to permit some preliminary
evaluation of their personality growth, the time factor was a deterrent to
any very extensive matching procedure. Previous to initiating the study, the
writer proposed the following plan for choosing the children to be studied:


1. Select a single school system, the schools 
of which represent a wide range of promotion 
rates. Thus, the need to equate school systems 
is eliminated. 

2. Early in the school year, select from five 
or more schools having relatively high nonpro- 
motion rates a total of at least fifty nonpromot- 
ed first-grade children. 

3. Equate this group with a group of second- 
grade children chosen from schools whose pat- 
rons represent comparable levels of socio-ec- 
onomic status. The factors to be equated are 
mental age, chronological age, and achievement, 
since significant differences between the groups 
in any one of these obviously would invalidate 
the findings. 

4. Check the progress records and health 
cards of all children selected. Eliminate from
the groups all those suffering from physical 
handicaps, such as speech or eye defects, and 
all those obviously suffering from severe per- 
sonality disorders. 

It is important to attack the promotion prob- 
lem as early as possible in the school life of the 
child. A boy or girl who fails the sixth grade, 
for example, probably has experienced the ef- 
fects of several previous near-failures. For
him, repeating that grade represents the repe- 
tition of only a fraction of his total schooling. 
For the first-grader, failure of the grade is not 
a cumulative experience. If this failing exper- 
ience is damaging to him, the damage should be 
revealed in his next year's development, while 
he is still at a young and impressionable age. 
Failure at the end of the first year represents 
the failure of one hundred percent of a school 
career measured in grades. 

Implementing the plan establishes two groups 
of children selected from ten or more schools 
in one school system. The first group of fifty
or more boys and girls is composed of nonpromoted
first-graders, the second of promoted
second-graders, both groups equated for mental
age, chronological age, and achievement, and
all children spending their second year in school.

Two major hypotheses, tested as null hypoth- 
eses, are set forth for investigation: 

1. There are no differences in social adjust- 
ment between the repeating and non-repeating 
groups of children. 

2. There are no differences in personal ad- 
justment between the repeating and non-repeat- 
ing groups of children.

Since any division of the personality can be 
only arbitrary, it is desirable to relate the 
component parts wherever possible so that a 
picture of wholeness may be conceived. 

It is important that a child get along with 
his peers, that his teacher accept him at a 
level common to the other children, and that 
he be sure within himself that his status is 




comparable to that of his classmates. Evaluating
these relationships calls for the utilization of
instruments which give the children opportunity 
to reveal how they feel about one another, in- 
struments which reveal how the teacher feels 
about each child, and instruments which reveal 
how the child feels about himself. It is not nec- 
essary to compare the results with any pre-de- 
termined standards of normalcy but simply to 
compare one group with the other and to determine
whether or not significant differences exist.
Identifying differences is tantamount to accepting
or rejecting the hypotheses. In the event
that the null hypotheses are rejected, it is im- 
portant to determine the group favored by the 
differences and to formulate some assumptions 
to account for the presence of these differences. 


SECTION II 


EXPERIMENTAL DESIGN OF THE STUDY 


The Selection of Groups


nonpromoted first-graders. All those who were repeating the grade for the
second time, or who had severe physical handicaps, were discarded.
Seventy-three nonpromoted children, representing twelve first-grade
classrooms, remained. Each of the first-grade teachers in these five schools
was asked to name the ten children promoted to grade two the previous June
about whose readiness for this grade she was most doubtful.
Approximately 150 children were selected in this manner. Thus, had the
nonpromotion rates of these classrooms been on a par with those of the
nonpromoted group, a considerable proportion of these 150 boys and girls
would have repeated grade one. From this number, those who had spent two
years in grade one and those with obvious physical handicaps were
eliminated. Those remaining were tested to equate a group of them with the
nonpromoted group on the basis of chronological age, mental age, and
achievement.

Equating the groups. —Kuhlmann-Anderson Tests (Grade I, Special) were
administered to all children for the purpose of securing the mental ages. To
obtain the achievement quotient, the Metropolitan Achievement Tests (Form R,
Primary I Battery for grade one and Primary II Battery for grade two) were
used.

Given the chronological age, mental age, and achievement quotients of the
preliminary groups of promoted and nonpromoted children, the selection of
final equated groups was largely a process of trial-and-error. Because of
the relative homogeneity in chronological age even before the actual
equating took place, it was possible to concentrate on securing
approximately comparable distributions for the other factors. It was
necessary to maintain several extra cases in each group to allow for
drop-outs and transfers so that two adequate samples remained in June 1948,
when all data had been collected.
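
The trial-and-error equating described above amounts to repeatedly checking whether the two groups have comparable distributions on each equating variable. A minimal Python sketch of such a check follows. The records, the field names, and the helper functions are hypothetical and introduced only for illustration; they are not data or procedures from the study.

```python
from statistics import mean, stdev

VARIABLES = ("chronological_age", "mental_age", "achievement_quotient")

def summarize(group):
    """Return {variable: (mean, standard deviation)} for a list of child records."""
    return {
        v: (mean(child[v] for child in group), stdev(child[v] for child in group))
        for v in VARIABLES
    }

def mean_differences(promoted, nonpromoted):
    """Promoted-minus-nonpromoted mean difference on each equating variable."""
    p, n = summarize(promoted), summarize(nonpromoted)
    return {v: p[v][0] - n[v][0] for v in VARIABLES}

# Hypothetical records (ages in months); these are NOT data from the study.
promoted = [
    {"chronological_age": 86, "mental_age": 80, "achievement_quotient": 95},
    {"chronological_age": 88, "mental_age": 84, "achievement_quotient": 100},
    {"chronological_age": 90, "mental_age": 88, "achievement_quotient": 105},
]
nonpromoted = [
    {"chronological_age": 87, "mental_age": 81, "achievement_quotient": 96},
    {"chronological_age": 88, "mental_age": 84, "achievement_quotient": 100},
    {"chronological_age": 89, "mental_age": 87, "achievement_quotient": 104},
]

differences = mean_differences(promoted, nonpromoted)
for variable, diff in differences.items():
    print(f"{variable}: mean difference = {diff:+.2f}")
```

In this made-up example the means agree while the spreads do not, which is the kind of mismatch that would send the selection back for another round of substitution.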


Selection of Evaluation Instruments

Collecting the data. —The evaluation instruments, their purposes and
application, are summarized in Table I. Discussion of validity and
reliability is omitted here because of space limitations. The following
points are pertinent to any consideration of appropriateness of the
instruments to the job to be done:


t 
1. The Kuhlmann-Anderson Tests are at "n 
S of the paper-and-pen^. 
lligence at this age poit 
2. The Metropolitan Achievement Tests were found to be well suited to the
curriculum of the schools selected at the time the study was conducted.


3. The fact that ctions of 
the California Test pd of the subse "IB" 


type to measure inte 


4. The Haggerty-Olson-Wickman Behavior 


Rating Schedul, "* 
: H Re 
TVations of boys and girls. P 


teachers: obse n7 
Sults of their use Provide valuable insights ! 


TABLE I

EVALUATION INSTRUMENTS

Purpose                     Instrument                            Application

A. To Equate Groups
   1. For Mental Age        Kuhlmann-Anderson Intelligence        Both Groups
                            Tests, Grade I (Special)
   2. For Achievement       Metropolitan Achievement Test,
                            Form R
                              Primary I Battery                   Nonpromoted Group
                              Primary II Battery                  Promoted Group

B. To Evaluate Adjustment
   1. Self-Rating           California Test of Personality,       Both Groups
                            Primary Series
   2. Peer-Rating           Sociometric Questions                 Both Groups
   3. Teacher-Rating        Haggerty-Olson-Wickman Behavior       Both Groups
                            Rating Schedules
                              Schedule A
                              Schedule B (Div. III and Div. IV)
TABLE II

THE HAGGERTY-OLSON-WICKMAN BEHAVIOR RATING SCHEDULES
DISTRIBUTION OF CHILDREN'S SCORES ON SCHEDULE A

Range            Promoted Group    Nonpromoted Group

109.5-119.5             1                  0
 99.5-109.5             0                  0
 89.5- 99.5             0                  0
 79.5- 89.5             0                  1
 69.5- 79.5             2                  3
 59.5- 69.5             3                  6
 49.5- 59.5             0                  5
 39.5- 49.5             3                  3
 29.5- 39.5            12                  8
 19.5- 29.5             7                  6
  9.5- 19.5            13                 13
  0.0-  9.5            14                 10

Number                 55                 55
Mean                 26.0               32.0
S.D.                 22.2              23.13
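
As a rough check on Table II, the means and standard deviations can be approximated from the grouped frequencies by treating every score as lying at the midpoint of its interval. The Python sketch below does this. It rests on two assumptions that are not stated in the source: that the promoted count for the 19.5-29.5 interval is 7 (the printed figure is illegible, and 7 is the value that makes the column total the stated 55), and that the standard deviation is computed with N rather than N - 1 in the denominator.

```python
from math import sqrt

# (lower limit, upper limit, promoted count, nonpromoted count) for each row.
# ASSUMPTION: the promoted count for 19.5-29.5 is taken as 7 so that the
# column sums to the stated N of 55; the printed digit is illegible.
ROWS = [
    (109.5, 119.5, 1, 0),
    (99.5, 109.5, 0, 0),
    (89.5, 99.5, 0, 0),
    (79.5, 89.5, 0, 1),
    (69.5, 79.5, 2, 3),
    (59.5, 69.5, 3, 6),
    (49.5, 59.5, 0, 5),
    (39.5, 49.5, 3, 3),
    (29.5, 39.5, 12, 8),
    (19.5, 29.5, 7, 6),
    (9.5, 19.5, 13, 13),
    (0.0, 9.5, 14, 10),
]

def grouped_stats(rows, column):
    """Return (N, mean, SD) from interval midpoints; column 0 = promoted, 1 = nonpromoted."""
    midpoints = [(low + high) / 2 for low, high, _, _ in rows]
    counts = [row[2 + column] for row in rows]
    n = sum(counts)
    m = sum(f * x for f, x in zip(counts, midpoints)) / n
    variance = sum(f * (x - m) ** 2 for f, x in zip(counts, midpoints)) / n
    return n, m, sqrt(variance)

promoted_stats = grouped_stats(ROWS, 0)
nonpromoted_stats = grouped_stats(ROWS, 1)
for label, (n, m, sd) in (("Promoted", promoted_stats), ("Nonpromoted", nonpromoted_stats)):
    print(f"{label}: N = {n}, mean = {m:.1f}, S.D. = {sd:.2f}")
```

The grouped estimates fall close to the tabled values, though small discrepancies in the standard deviations are to be expected if the originals were computed from ungrouped scores.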




to children's social adjustment and into the 
emotional areas of self-adjustment. 

5. To secure peer-rating data, each promot- 
ed and nonpromoted child was asked to name: 

a. The three children in his own classroom
whom he would like to have as
his very best friends. 

b. Three children in his own classroom 
whom he would not care to have as best 
friends. 


With a major purpose of this study being to
determine how well promoted and nonpromoted
children fit into the groups of which they are
members, this sociometric technique yielded
very valuable evidence.

By the end of the School year, the following 
data were available for analysis: 


1. Results of the California Test of Personality,
administered to each child first in October
or November and again in May.


repeated in May. 


3. Ratings on the Haggerty-Olson-Wickman Behavior Rating Schedules secured
for each child in May 1948.


Summarizing the data, — The data having been 
collected izati 


total weighted Scores, b 
Sections. Means and 5 


To identify friendship patterns, all instances of sociometric mutual
acceptance were tabulated. Finally, the scores for each child on all
instruments were checked in order to see if any patterns of consistently low
or high scores pertained.
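
The tabulation of mutual acceptance mentioned above can be sketched in Python as a search for reciprocated friendship choices. The child labels and choices below are hypothetical, and the function name is an illustrative assumption; the source does not describe how the tabulation was actually carried out.

```python
def mutual_acceptances(choices):
    """Return the set of child pairs in which each named the other as a best friend.

    `choices` maps each child to the set of classmates he or she chose.
    """
    pairs = set()
    for child, chosen in choices.items():
        for other in chosen:
            if other != child and child in choices.get(other, set()):
                pairs.add(frozenset((child, other)))
    return pairs

# Hypothetical three-choice data for five children; NOT data from the study.
choices = {
    "A": {"B", "C", "D"},
    "B": {"A", "C", "E"},
    "C": {"A", "D", "E"},
    "D": {"A", "B", "C"},
    "E": {"B", "C", "D"},
}

pairs = mutual_acceptances(choices)
for pair in sorted(sorted(p) for p in pairs):
    print(" <-> ".join(pair))
```

Storing each pair as a `frozenset` keeps a reciprocated choice from being counted twice, once from each child's side.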


Analyzing the data. — The following steps were followed in analyzing the
data:

1. Examination and statistical treatment of the summaries described above to
see if any major group differences could be identified.

2. Examination and statistical treatment of the distributions on single
items to see what, if any, items discriminated significantly between the
groups.

3. Synthesis of all significant results to identify any patterns of group
differences.

4. Acceptance or rejection of the two null hypotheses relative to social and
personal adjustment.

5. Qualitative comparison of differences to determine in what areas and to
what extent one group might be considered better adjusted than the other.

6. Further probing into the nature of group differences to determine if
logical explanations for their presence could be found.

7. Formulation of recommendations as indicated by the findings.


SECTION III

ANALYSIS OF THE SELF-RATING DATA


California Test Findings


Striking similarities in both initial and final performances of the two
groups on the California Test of Personality were revealed. In Total
Adjustment, both groups made identical mean gains of 0.9, the promoted group
moving from a mean of 61.9 to one of 62.8, while the nonpromoted group
shifted from 60.7 to 61.6. A downward trend occurred only in Self-Adjustment


respectively, t 
The shifts in mean raw scores from first to second testings slightly favored
the promoted group on Self-Reliance, Freedom from Withdrawing Tendencies,
Freedom from Nervous Symptoms, Social Standards, Social Skills, and Freedom
from Anti-Social Tendencies. The shift was to the advantage of the
nonpromoted group on Sense of Personal Worth, Sense of Personal Freedom,
Feeling of Belonging, Family Relations, School Relations, and Community
Relations. In one instance, the mean of one group remained constant while
that of the other dropped slightly; in another, the advantage was brought
about by the mean of one dropping less than that of the other.

The California Test of Personality is so constructed that a high score is
considered indicative of a high level of adjustment. Such being the case,
mean raw score differences favored neither group on Total Adjustment, were
slightly to the advantage of the promoted group on the two subsections, Self
and Social Adjustment, and favored the groups almost alternately on the
twelve component categories. A vital question arose at this point: With what
degree of confidence may these differences be accepted as true ones, not
resulting simply by chance? That is, what assurance is there that these
findings would not be reversed with repeated experiments of a similar
nature? The significance of the differences between the groups was tested by
the regression technique described by Peters and Van Voorhis (34).


Statistical Significance of the Findings


It will be recalled that the children in the promoted group were not
matched, case for case, with those in the nonpromoted group. Rather, the
groups were equated (that is, arranged in equivalent distributions) for
chronological age, mental age, and achievement. The regression technique
provides for a hypothetical matching of groups, thus obviating the necessity
of having precisely matched pairs. The theoretical regression of one group
is predicted from the actual regression of the other. To illustrate, using
the promoted group as the control, the regression of end scores (May
testing) upon initial scores (October-November) was determined. Then, on the
assumption that the experimental group (nonpromoted children) developed in
the same fashion as the control group, the end score for each nonpromoted
child was predicted from his initial score. The formula is

$$X_i' = r\,\frac{\sigma_2}{\sigma_1}\,X_i + \left(M_2 - r\,\frac{\sigma_2}{\sigma_1}\,M_1\right)$$

where $X_i$ and $X_i'$ are the initial and predicted scores, $r$ is the
correlation coefficient obtained by correlating initial scores with end
scores for the promoted group, $M_1$ and $M_2$ are initial and final means
respectively, and $\sigma_1$ and $\sigma_2$ are the initial and final
standard deviations of the entire promoted group. Then, by comparing the end
scores actually obtained by nonpromoted children with the scores predicted
for them, the significance of group differences was indicated, as
illustrated below.


Total adjustment scores. —This formula was then applied to the California
Test scores. Initial and end scores on Total Adjustment for the promoted
group correlated .583. Substituting the appropriate means and standard
deviations in the formula produced the following:

$$X_i' = .583\,\frac{12.4}{10.3}\,X_i + \left(62.8 - .583\,\frac{12.4}{10.3}\,61.9\right)$$


Simplifying, and substituting $Y'$ and $Y_1$ for $X_i'$ and $X_i$ to
represent predicted and initial scores of nonpromoted children, the equation
became $Y' = .700Y_1 + 19.5$. This equation was then used to predict the
scores for each member of the nonpromoted group by substituting initial
scores for $Y_1$. Thus, the predicted score for the first nonpromoted child,
whose initial Total Adjustment score was 61, was computed to be 62.2. It was
found that the nonpromoted group, on the final testing, fell short of the
predicted expectancy by an average of .485 raw score points per child.
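
The prediction step can be retraced with a short Python sketch. It uses only the figures quoted in the text for the promoted group (correlation .583, standard deviations 10.3 and 12.4, means 61.9 and 62.8); the function and variable names are illustrative assumptions. Unrounded arithmetic gives a slope and intercept slightly different from the rounded .700 and 19.5 reported, which is consistent with hand computation.

```python
def regression_line(r, sd_initial, sd_final, mean_initial, mean_final):
    """Slope and intercept for predicting a final score from an initial score."""
    slope = r * sd_final / sd_initial
    intercept = mean_final - slope * mean_initial
    return slope, intercept

# Promoted (control) group values for Total Adjustment, as quoted in the text.
slope, intercept = regression_line(
    r=0.583, sd_initial=10.3, sd_final=12.4, mean_initial=61.9, mean_final=62.8
)

# Predicted final score for a nonpromoted child whose initial score was 61.
predicted = slope * 61 + intercept

print(f"slope = {slope:.3f}, intercept = {intercept:.2f}")
print(f"predicted final score for an initial score of 61: {predicted:.1f}")
```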

In order to find the statistical significance 
of this difference, the standard error of the dif- 
ference was first computed using the formula 


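$$\sigma_D = \sqrt{\frac{\sigma_{x_f}^2\,(1 - r^2)}{N_x - 1} + \frac{\sigma_{y-y'}^2}{N_y - 1} + \frac{(M_{x_i} - M_{y_i})^2\,\sigma_{x_f}^2\,(1 - r^2)}{(N_x - 1)\,\sigma_{x_i}^2}}$$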
where the subscripts i and f stand for initial and final scores,
respectively, and where x refers to the promoted group and y the
nonpromoted. By substituting values, where $\sigma_{y-y'}$ became 9.40, the
equation became

$$\sigma_D = \sqrt{\frac{12.4^2\,(1 - .583^2)}{54} + \frac{9.40^2}{54} + \frac{(61.9 - 60.7)^2\,12.4^2\,(1 - .583^2)}{54\,(10.3^2)}} = 1.882$$

Dividing the mean difference by the standard error of this difference
produced a critical ratio of 0.258 as follows:

$$\text{C.R.} = \frac{.485}{1.882} = 0.258.$$


The probability (from tables) of obtaining a deviate
greater than this value in either direction is
.794, an extremely large probability. It may be 
concluded that there are no significant differences 
between the promoted and nonpromoted groups 
for Total Adjustment on the California Test of 
Personality. The null hypothesis that there are 
no differences is retained. 
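
The arithmetic behind the critical ratio can be retraced with a short Python sketch. It is a reconstruction under stated assumptions rather than the author's worksheet: the three-term form of the standard error and the use of 54 as the divisor in every term are inferred from the legible fragments of the printed equation, and are adopted here because they reproduce the reported 1.882. The normal-curve probability is computed with `math.erfc` instead of being read from a table, so it differs slightly from the tabled .794.

```python
from math import erfc, sqrt

def standard_error_of_difference(sd_final_x, r, sd_residual_y,
                                 mean_initial_x, mean_initial_y,
                                 sd_initial_x, divisor):
    """Standard error of the mean difference between attained and predicted scores.

    ASSUMPTION: the same divisor is used in all three terms; with 54 this
    reproduces the value reported in the text.
    """
    residual_variance_x = sd_final_x ** 2 * (1 - r ** 2)
    return sqrt(
        residual_variance_x / divisor
        + sd_residual_y ** 2 / divisor
        + (mean_initial_x - mean_initial_y) ** 2 * residual_variance_x
        / (divisor * sd_initial_x ** 2)
    )

def two_tailed_probability(z):
    """Probability of a normal deviate larger than |z| in either direction."""
    return erfc(abs(z) / sqrt(2))

se = standard_error_of_difference(
    sd_final_x=12.4, r=0.583, sd_residual_y=9.40,
    mean_initial_x=61.9, mean_initial_y=60.7, sd_initial_x=10.3, divisor=54,
)
critical_ratio = 0.485 / se
probability = two_tailed_probability(critical_ratio)

print(f"standard error = {se:.3f}")
print(f"critical ratio = {critical_ratio:.3f}")
print(f"two-tailed probability = {probability:.3f}")
```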

Self and social adjustment scores. —The procedure for using the regression
technique and testing the significance of the difference between predicted
and attained scores has been described in considerable detail, since this
was the procedure followed in analyzing major group differences on all three
types of data. The correlation of initial and final Self-Adjustment scores
of promoted children was .518. Substituting this figure, together with data
in the regression formula, produced the prediction equation
$Y' = .641Y_1 + 11.63$. Nonpromoted group scores were then predicted from
this equation. Attained scores fell short of predicted scores, on the
average, by 0.673 raw score points. The standard error


ao .4* (1 - 583? 
epp. [12.8 (= 5934) | 


of this difference was 1.132, which, divided into the mean difference,
produced a quotient of 0.595. Reference to a table of integrals of the
normal curve showed that, if the true difference were zero, a difference of
.595 standard errors could be expected in .548 of the trials while the
opposite could be expected in .452 of the trials. Again, the null hypothesis
that there are no differences between the groups on the Self-Adjustment
phase of the California Test of Personality is retained.

The shift from initial to final mean scores 
favored the promoted group both for Self- and 
Social Adjustment but to a Slightly greater de- 
gree for the former, There was no need, there- 
fore, to analyze the data for Social Adjustment 
any further. It was assumed that the difference 


curred on the subsecti 


Analysis of Items

Purpose of the analysis. —It was pointed out that not all of the subsections
of the California Test of Personality are discrete.
Certain items can be transferred from one cate- 
gory to another without 


co icti i 
of their new subsections, "This erus, tte p 
detracting from the value of the Subsectio d ile 
not change the validity of the item hi gei 
ed on its own merits. Therefore 
tion of the distribution of te του 
the 96 items constituting the test as euch of 
in order to further the Purposes of thig Stud: ken 
the following ways: y in 




9.40? , (61.9 - 60.7)? 12.4? (1 - .583?) 


54(10.32) 


1. To identify any single items that clearly discriminated between the
groups.

2. To identify any patterns of group differences through meaningful
combinations of discriminating items.

3. To identify possible leads for further exploration or research.

Results of the analysis. — The following items were selected as offering
greatest promise of discriminating significantly between the groups in the
area of Self-Adjustment: 1 and 4 under Self-Reliance; 5 and 6 under Sense of
Personal Worth; 1 and 6 under Feeling of Belonging; [illegible] and 4 under
Freedom from Withdrawing Tendencies; and 2 and 4 under Freedom from Nervous
Symptoms. The following items were selected as revealing greatest
differences between the groups in Social Adjustment: 1, 4 and 7 under Social
Standards; 2 and 7 under Social Skills; 2, 4 and 8 under Freedom from
Anti-Social Tendencies; 2, 3 and 8 under Family Relations; 1, 2, 4 and 5
under School Relations; and 6 and 8 under Community Relations. Whenever a
group changed its score by eight or more persons on any given item from
initial to final testing, the item was included in the above list unless the
other group also changed its score in the same direction. Several, but not
all, of the items

& à shift of seven persons pers 

rial purposes. Had such a sus nf 
Proven statistically Significant, the other i ces 
also would have been tested. In three ιν 
(items 6 under Sense or Personal Worth, 1 Ὁ 
der Feeling of Belonging, and 7 under Soci 
Skills), items were chosen when they repre? 
ed ranges of only six persons between the ae ο 
of initial and final testings or between group 
any One testing, These are items on wine 
of one or both groups 56 zi that 
found, as is to be expectet ant 
ller shift to produce poor 


Se circu ces than W^ gis“ 
the members of thi pese iy? 


: right and wrong answers. dif” 
f Techniques for testing the significance a 
lerences, — Testing the s ob 
5 τρ Presented the complex [^ 
ο tings h 
single int groups on two testings $? zc | 


ent“ 
its 


testi 
e groups on one a 
© SO on the other, The other S ny 
used to test Shifts made by the children 


of one group, no matter how the two groups compared on either testing.
The data for item 6 under Feeling of Belonging are used to illustrate the
first of these techniques. The first time the test was administered, 33
promoted and 46 nonpromoted children answered the question in the way deemed
desirable by the test-makers. On the second testing, however, 37 promoted
and 37 nonpromoted children answered the question in this way. The latter
group dropped 9 persons, while the former increased 4; differences apparent
on the first testing no longer existed. The 33 promoted children represented
60.0 percent of the group; the 46 nonpromoted children represented 83.6
percent of their group. The difference between the two percentages was 23.6.
The standard error of this difference was determined from the formula

$$\sigma_{D\%} = 100\sqrt{\frac{p_1 q_1}{N_1} + \frac{p_2 q_2}{N_2}}$$

where $p_1$ and $q_1$ are the proportions of the promoted group, and where
$p_2$ and $q_2$ are the proportions of the nonpromoted group scoring right
and wrong. This is the formula for computing the standard error of the
difference between two uncorrelated percentages (19). Substituting, the
equation became

$$\sigma_{D\%} = 100\sqrt{\frac{.964 \times .036}{55} + \frac{.855 \times .145}{55}} = 5.38$$

The critical ratio, then, was $10.9 / 5.38 = 2.03$. From a table of
integrals, the probability of obtaining a difference of such magnitude was
found to be .004. This difference, therefore, is significant at considerably
better than the 1 percent level.*
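
The standard error of a difference between two uncorrelated percentages can be computed with a few lines of Python. The sketch below uses the proportions that appear in the printed substitution (.964 and .855, with 55 children in each group); the function name is an illustrative assumption. The result agrees with the reported 5.38 to within rounding.

```python
from math import sqrt

def se_difference_uncorrelated(p1, n1, p2, n2):
    """Standard error, in percentage points, of the difference between two
    independent percentages, given the proportions p1 and p2."""
    return 100 * sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

# Proportions as they appear in the printed substitution; N = 55 per group.
se_percent = se_difference_uncorrelated(0.964, 55, 0.855, 55)
print(f"standard error of the difference = {se_percent:.2f} percent")
```

Because the product of a proportion and its complement is largest at .50, the same formula yields a larger standard error for groups split near the middle than for groups in which nearly every child answers the same way.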

The other technique is illustrated by analysis of data for item 4 under
Freedom from Withdrawing Tendencies. On this item, the promoted group
increased from 18 children answering correctly on the initial testing to 19
so answering at the end of the year. The nonpromoted group, meanwhile,
dropped from 26 to 7. The purpose this time is to test the significance of
this nonpromoted group decline. Only 12.7 percent of the nonpromoted group
on the final testing, as compared to 47.3 percent at the beginning, scored
correctly on this item, a difference of 34.6 percent. Again, the problem was
one of testing the differences between proportions. This time, however,
since the group of 55 children was the same on both trials, a correlation
factor was involved. However, the correlation coefficients indicating the
amount of agreement between first and second testings on single items were
not available. Therefore, in order to employ a conservative computation,
this correlation was assumed to be zero. The equation for computing the
standard error of the difference between correlated proportions is

$$\sigma_{D\%} = 100\sqrt{\frac{p_1 q_1}{N_1} + \frac{p_2 q_2}{N_2} - 2r\sqrt{\frac{p_1 q_1\,p_2 q_2}{N_1 N_2}}}$$


where $r$ is the correlation between initial and end nonpromoted scores on
the given item. Obviously, the value of the entire expression involving the
correlation coefficient is zero. Substituting for the selected sample, the
equation became

$$\sigma_{D\%} = 100\sqrt{\frac{.473 \times .527}{55} + \frac{.127 \times .873}{55} - 2(.379)\sqrt{\frac{.473 \times .527 \times .127 \times .873}{55 \times 55}}} = 6.52$$


Dividing this figure into 34.6, the difference,
produced a critical ratio of 5.31. The probability
of obtaining a deviate greater than this value
in either direction is less than .001. The null
hypothesis that there were no differences between 
the groups is rejected and it is assumed that 
there was a statistically significant difference 
in favor of the promoted group on this item. 
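
The correlated-proportions computation can be sketched in Python as follows. The proportions (.473 and .127) and the group size (55) come from the text. The coefficient of .379 in the correlation term is an assumed reading of a badly printed substitution; it is used here only because it reproduces the reported standard error of 6.52 and critical ratio of 5.31. Setting the correlation to zero, as the text describes, gives a larger standard error and hence a more conservative test.

```python
from math import sqrt

def se_difference_correlated(p1, p2, n, r):
    """Standard error, in percentage points, of the difference between two
    percentages observed on the same n children, with correlation r."""
    term1 = p1 * (1 - p1) / n
    term2 = p2 * (1 - p2) / n
    return 100 * sqrt(term1 + term2 - 2 * r * sqrt(term1 * term2))

# ASSUMPTION: r = .379 is inferred from the garbled printed equation.
se_with_r = se_difference_correlated(0.473, 0.127, 55, r=0.379)
se_zero_r = se_difference_correlated(0.473, 0.127, 55, r=0.0)
critical_ratio_item = 34.6 / se_with_r

print(f"standard error with r = .379: {se_with_r:.2f}")
print(f"standard error with r = 0:    {se_zero_r:.2f}")
print(f"critical ratio (r = .379):    {critical_ratio_item:.2f}")
```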

Results of the statistical analysis. — Items 
which discriminated significantly between the 
groups at or within the arbitrary confidence 
levels of 5 and 1 percent are listed below. The 
list represents a summary of all significant dif- 
ferences discovered from utilization of the Cal- 
ifornia Test of Personality. 

The following items represent differences 
that are significant at better than the 1 percent 
level of confidence: 


1. Do the children ask you to play with them?
The nonpromoted group gained significantly (0.4 


has ferences were requi 
Produg been pointed out that — Se eee labo £86 SUE cones, ice Dypedhakiess Crue 


tion be used, a procedure that produces slightly more con- 
nt data, however, revealed that utilization of this al- 
hanges that were taken care of within the broad 
This technique is described by E. F. Lindquist, A First


red at the tails of the distribution to 


in Statistics (Boston: Houghton Mifflin Co., 1942), pp. 125-129.


310 JOURNAL OF EXPERIMENTAL EDUCATION 


percent) from initial to final testing. However,
the insignificant gain on
this item for the promoted group must be taken 
into consideration in interpreting this difference. 

2. Do you think other children do not like
you? The promoted group gained significantly
(0.4 percent) while the nonpromoted group drop-
ped on this item, both scoring evenly on the final
testing.

3. Do you like to stay away from many of the
children? Slightly more nonpromoted than pro-
moted children answered this question correctly
on the initial trial, a difference that was not statis-
tically significant. However, on the final trial,
more promoted than nonpromoted children an-
swered correctly, a difference that was signif-
icant at the 0.4 percent level.

4. Would you rather think about nice things 
than play? The nonpromoted group dropped sig- 
nificantly (better than 0.1 percent), while the 
promoted group improved slightly from initial 
to final testings. 

5. Do you have many colds? The promoted
group improved significantly (0.6 percent) while
the nonpromoted group declined almost as sig-
nificantly.


6. Should one be nicer to bright children than 
to others? The promoted group increased sig- 


nonpromoted group declined,
was still slightly ahead on the


3. Should a person always be nice to those
who win from him in games? The nonpromoted
group remained constant from first to final test-
ings on this item, while the promoted group in-
creased significantly (2.0 percent).


(Vol. XXII 


4. Do you tell mean children what you think 
of them? The promoted group improved signif- 
icantly (4.4 percent) from initial to end trials, 
while the nonpromoted group declined almost as
markedly. 


5. Do many of the children start quarrels
with you? The nonpromoted group declined
slightly, while the promoted group improved
significantly (4.4 percent).

6. Do people try to argue with you a great
deal? Slightly fewer promoted children an-
swered correctly on the final trial than had an-
swered correctly on the initial trial; markedly
fewer nonpromoted children answered correct-
ly on the final trial (2.4 percent).

7. Do you feel that no one loves you at home?
The nonpromoted group fell only slightly from
first to final testing on this item, while the pro-
moted group fell significantly (3.2 percent).

8. Is it hard to talk things over with your
folks because they don't understand? The pro-
moted group increased slightly on this item
while the nonpromoted group declined signifi-
cantly (2.0 percent).

9. Are you often unhappy because of getting
low marks in school? Slightly fewer nonpro-
moted children answered this question correct-
ly on the final trial than on the initial trial, while
significantly more (3.2 percent) promoted chil-
dren answered correctly on the final trial than
on the initial trial.


10. Do your classmates often say things that
hurt your feelings? The nonpromoted group de-
clined on this item while the promoted group
increased significantly (estimated by compari-
son with similar shifts on other items to be at
the 3.0 percent level).


11. Do you think that most of the children in
[illegible] are trying to keep you away from
them? While the nonpromoted group declined
on this item, the promoted group increased sig-
nificantly (estimated by comparison with similar
gains on other items to be at the 3.0 percent
level).


12. Are conditions of your neighborhood [remainder of item illegible]
promoted group dropped only very slightly on
this item and the promoted group dropped sig-
nificantly, a drop that was estimated by compar-
ison with [illegible] other items to be at the [illegible] percent


Twenty items from the California Test of Per-
sonality have been identified as discriminating


June, 1954) 


terns and are interpreted in the light of the ma-
jor hypotheses in Section IV.


SECTION IV 


ANALYSIS OF THE PEER-RATING DATA 


Introduction 


This section presents and analyzes the re-
sults of the sociometric questions, whereby the
children indicated their acceptance or rejection
of one another. Each child in the twenty-three
participating classrooms was asked to name the
three children he would like to have and the
three children he would not like to have as very
best friends. These questions were asked both
near the beginning and towards the end of the
school year. When all the data had been com-
piled, a few gaps were apparent, making it im-
possible to present complete sociometric results.
Consequently, all computations in this section
are based upon 52 promoted and 51 nonpromot-
ed children.


Sociometric Findings 


Nature of the data. — The sociometric data
are presented in both percentile and raw per-
centage form. Both were computed for each
child on the basis of the findings for the class-
room of which he was a member. Thus, for
percentiles, each child in the room was arranged
in descending order of total choices received
and then a percentile rank assigned according
to position in his sex group. It was consid-
ered preferable to rank separately for each sex.
In a situation where 60 of the total choices were
received by boys, a boy receiving 2 choices
from boys and 1 from girls was credited with
3/60 or 5 percent of the total and was as-
signed a percentile rank according to his place-
ment among the males.
These data indicate a given child's relation-
ship to his peers in two different ways. Percent-
iles show quickly how a child ranks in relation
to others of his group. However, this is a
relative relationship computation that completely
hides the nature of the original quantitative dis-
tribution. It hides, for example, quantitative
differences where two children, one receiving
ten choices and the other only four, are both at
the ninetieth percentile in their respective
groups. Such illustrations occurred in the pres-
ent data when, in one classroom, two or three
children secured a disproportionately large num-
ber of choices or rejections while, in another,
the selections were spread out fairly evenly
over the entire group. The use of percentage
scores tends to remedy this situation.
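
The two forms of score described above can be illustrated with a short sketch. The function names are hypothetical, and the percentile-rank convention (half of any tied scores counted as below) is an assumption, since the text does not state which convention was used.

```python
def percentage_of_choices(choices_received, total_for_sex):
    """Raw percentage: a child's choices as a share of all choices
    received by children of the same sex in the same classroom."""
    return 100.0 * choices_received / total_for_sex


def percentile_ranks(scores):
    """Percentile rank of each score within its own group
    (assumed convention: count scores below plus half of the ties)."""
    n = len(scores)
    ranks = []
    for s in scores:
        below = sum(1 for t in scores if t < s)
        tied = sum(1 for t in scores if t == s)
        ranks.append(100.0 * (below + 0.5 * tied) / n)
    return ranks


# The example in the text: 3 choices out of the 60 received by boys.
print(percentage_of_choices(3, 60))

# Two hypothetical groups whose top child received ten and four choices.
room_a = [0, 1, 2, 10]
room_b = [0, 1, 2, 4]
print(percentile_ranks(room_a)[-1], percentile_ranks(room_b)[-1])
```

The top child holds the same percentile rank in both hypothetical groups although one received ten choices and the other four, which is the loss of quantitative information that the percentage score is meant to remedy.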




Percentile scores. — High scores for choices
and low scores for rejections are indicative of
desirable social adjustment. In choices, the
promoted group ranked below the theoretical
mean of 50 for a normal distribution but im-
proved from a mean of 37.2 to one of 42.4.
The nonpromoted children, meanwhile, moved
downward from a mean percentile score of 55.1
to one of 45.9. In rejections, the promoted
group improved its position by dropping from
51.6 to 44.5, while the nonpromoted group re-
mained constant with means of 57.4. The trends
were to the advantage of the promoted group in
all instances. However, in spite of its substan-
tial drop in relative popularity, the nonpromot-
ed group maintained higher percentile means
for choices at both first and final testings. Ad-
versely, this situation was true for the nonpro-
moted children in regard also to their rejection
picture.

In regard to choices, 29 promoted children 
improved over their initial positions, 22 de-
creased in their group standings, and 1 remained
the same. In rejections, 21 promoted children 
improved their positions by obtaining a lower 
score, 18 received a higher rejection rating, 
and 3 remained constant. Twenty-four nonpro- 
moted children improved their standing in 
choices, while 27 lowered in rank. Twenty-five 
of this group improved their rejection picture, 
while 23 shifted for the worse, and 3 revealed 
no change. Once again, the nature of the shifts 
favored the promoted group. 


Percentage scores. — The percentile findings
now may be compared with the results expres-
sed in percentages of the total selections made
in each classroom. The promoted group in-
creased from a mean of 2.56 percent to one of
3.96, while the nonpromoted group decreased
from 6.30 to 6.11. Although the direction of
the shift again is to the advantage of the promot-
ed group, the nonpromoted group maintained
considerably higher mean scores at both fall and
spring testings. The higher means of the latter
were produced in part by a few scores scatter-
ing well above the bulk of the scores in the dis-
tribution. The shift in both groups from initial
to final testings was very pronounced at the ex-
treme lower end of the distribution. Although
26 promoted children received scores of less
than 2 percent of all the choices received by
their sex in their rooms when questioned in Oc-
tober or November, only 11 were so rated in
May. Only 5 nonpromoted children fell into
this category when the sociometric questions
were first asked, but 17 occupied this position
when the questions were repeated. It is ob-
vious that the number of social "isolates" and
"fringers"—that is, the number of children
not wanted or little wanted for friends—in-
creased markedly for the nonpromoted group




and declined considerably for the promoted
group.

The trend indicated by the shifts in mean per- 
centages is verified by the shifts made by indi- 
viduals. Thirty promoted children to 20 non- 
promoted children improved their standing, 12 
promoted to 24 nonpromoted cases lost ground, 
and 10 promoted to 7 nonpromoted children re- 
mained constant. This situation clearly is to 
the advantage of the promoted group. 

In regard to rejection, the promoted group
improved by dropping from a mean percentage
of 6.94 to one of 5.15. Meanwhile, the means
for the nonpromoted group shifted from 6.94 to


Statistical Significance of the Findings


Percentile scores. — Initial and final accept-
ance scores for the promoted group correlated
.327. Nonpromoted group end scores were pre-
dicted from the equation Y2 = .38Y1 + 28.41 and
exceeded actually attained end scores by a
mean difference of 1.24. This
difference, divided by its standard error of 5.83,
produced a critical ratio of .213. The proba-
bility of obtaining a deviate greater than this
value in either direction is .834. The null hy-
pothesis that there are no differences between
the promoted and the nonpromoted children on
percentile ratings for social acceptance is re-
tained.
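
The regression technique used here can be sketched in a few lines. The sketch is illustrative only: the function names and the three pupils are hypothetical, and the standard error is taken as given because its formula (described in Section III) is not reproduced in this section.

```python
def regression_line(r, mean_x, sd_x, mean_y, sd_y):
    """Slope and intercept for predicting an end score (y) from an
    initial score (x), built from the promoted group's statistics."""
    slope = r * sd_y / sd_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(initial_scores, slope, intercept):
    """Theoretical end scores for each initial score."""
    return [slope * x + intercept for x in initial_scores]


def mean_excess(predicted, attained):
    """Mean amount by which predicted scores exceed attained scores."""
    return sum(p - a for p, a in zip(predicted, attained)) / len(predicted)


# Equation reported in the text for percentile acceptance scores.
SLOPE, INTERCEPT = 0.38, 28.41

# Hypothetical nonpromoted pupils: (initial percentile, attained end percentile).
pupils = [(80.0, 55.0), (50.0, 48.0), (20.0, 37.0)]
predicted = predict([x for x, _ in pupils], SLOPE, INTERCEPT)
excess = mean_excess(predicted, [y for _, y in pupils])

# Critical ratio from the figures reported in the text (1.24 and 5.83).
reported_ratio = 1.24 / 5.83
print(round(reported_ratio, 3))
```

Because the line is fitted on the promoted group only, a mean excess of predicted over attained scores shows how far the nonpromoted group fell short of what promoted children with the same starting point achieved.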

In regard to the rejection picture, correlat-
ing initial and end scores of promoted children
produced a coefficient of .358. Theoretical end
scores for nonpromoted children were predict-
ed from the equation Y2 = .36Y1 + 5.84. Attained
scores exceeded predicted scores by a mean dif-
ference of 11.28. In this case, an undesirable
situation is implied by attained scores being in
excess of predicted scores, since the factor in-
volved is social rejection. The standard error
of the difference was 4.99, which, when divided
into the mean difference, produced a critical ratio of
2.26. The probability of obtaining a deviate
greater than this value in either direction is
.024. The null hypothesis that there are no dif-
ferences between the promoted and the nonpro-
moted children in regard to sociometric rejec-
tion may be rejected at the 2.4 percent level of
statistical significance.


Percentage Scores, —Q 
end raw percenta, 
children Correlated , 4 
Scores were predicted 


«495 + 2. 69 and fell short of actually attained 
scores 


data revealed trend This ti yan average of -273 points per pupil. 


: ; Since social acceptance is being con- 
Sidered. The Standa. 


Was 1.05, Mean diff 


Are these findings statistically significant?
Would they be reversed with subsequent exper-
imentation? Answers to these questions were
sought once again through application of the re-
gression technique, as described in the prev-
ious section, and are presented in the succeed-
ing section.


nonpromoted group were pre-
[dicted from the eq]uation Y2 = .95Y1 + 2.65.


The standard error
was 1.75 which, when divid-
[ed into the mean difference, pro]duced a critical ratio of 1.18.




This difference, then, favoring the promoted
group, is significant at about the 7.0 percent
level and is not great enough to be accepted with
the conventional degree of confidence. However,
differences of this magnitude must not be over-
looked, since they serve a definite purpose in
strengthening or balancing differences identified
by other means.

Summary. — The foregoing analysis of the
statistical significance of sociometric findings
revealed the following:


1. For acceptance as revealed by choices re-
ceived: a slight but statistically insignificant
advantage for the promoted group on data com-
puted on a percentile basis; a slight and equally
insignificant advantage for the nonpromoted group
on data computed in percentages.

2. For rejection: advantages for the promot-
ed group on both percentile data (2 percent lev-
el) and percentage data (7 percent
level).


An Index of Adjustment 
Computation of the index. — The analysis pre-
sented in the preceding section was based, in
part, on the assumption that position of two
groups on a continuum of sociometric accept-
ance or rejection is indicative of relative adjust-
ment. High scores for acceptance and low scores
for rejection, then, indicate a desirable picture
of adjustment, while low scores for acceptance
and high scores for rejection indicate a less
desirable picture. In certain instances, these
statements might not apply. It might be better
for the development of the individual concerned,
for example, to be sufficiently "outwardgoing"
as to incur dislike than to be so completely "color-
less" as to draw neither the acceptance nor the
rejection of his peers. Applied in a general way,
however, the assumption appears to be a sound
one.

Data for acceptance and rejection were anal-
yzed separately in the preceding section. No
attempt was made there to pair choices and re-
jections for any given individual to show the com-
bined picture of adjustment. Such an attempt
was made by computing an "index of total ad-
justment" for each child, using the percentile
data. It will be recalled that percentile ranks
were assigned to boys, for instance, according
to their relative positions among all the other
boys in their own classrooms. A boy receiving
a percentile rank of less than 25 for choices,
then, ranked in the bottom quarter of his class
group and was assigned an acceptance index of
1. Those with ranks of from 26 to 50 were in
the second lowest quarter and were assigned an
index of 2, and so on. For rejection, however,
a low score is more desirable than a high one,




according to the assumption expressed above. 
Low scores for acceptance and high scores for 
rejection represent two criteria of undesirabil-
ity and so were equally weighted. Children with
percentile ratings for rejection of 76 or more 
ranked in the top quarter and were assigned an 
index of 1. Thus, a child ranking in the bottom 
quarter for rejections received an adjustment 
index of 4. A child in the top quarter for accept- 
ance and the bottom quarter for rejection re- 
ceived the highest possible score of 8. In this 
fashion, a social adjustment index, for socio- 
metric percentile data, was assigned each of the 
52 promoted and 51 nonpromoted children for 
whom complete data were available. 
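
The scoring rule just described can be written out as a small sketch. The function names are hypothetical, and the treatment of ranks falling exactly on a quarter boundary (25, 50, 75) is an assumption, since the text speaks only of ranks "less than 25", "from 26 to 50", and "76 or more".

```python
def quarter(percentile):
    """1 for the bottom quarter up to 4 for the top quarter.
    Boundary ranks are assigned to the lower quarter (an assumption)."""
    if percentile <= 25:
        return 1
    if percentile <= 50:
        return 2
    if percentile <= 75:
        return 3
    return 4


def adjustment_index(acceptance_percentile, rejection_percentile):
    """Acceptance index (1 to 4) plus reversed rejection index (4 to 1),
    giving a total between 2 and 8."""
    acceptance = quarter(acceptance_percentile)
    rejection = 5 - quarter(rejection_percentile)
    return acceptance + rejection


print(adjustment_index(90, 10))   # top quarter accepted, bottom quarter rejected
print(adjustment_index(10, 90))   # bottom quarter accepted, top quarter rejected
```

Weighting the two components equally follows the statement in the text that low acceptance and high rejection were treated as equally undesirable.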

The findings. — The promoted group increased
its mean score from 4.42 to 5.00; the nonpro-
moted group dropped from 4.94 to 4.71. This
shift is not emphasized at any particular point 
on the distribution but appears throughout. 
Although 27 promoted children received scores 
of 4 or less on the initial testing, only 18 were 
so rated in May. Meanwhile, the number of non- 
promoted children in these categories changed 
from 21 at the beginning to 24 at the end. Thir- 
teen promoted children compared to 19 nonpro- 
moted children scored 6 or better on the initial 
questioning; 22 of the former and 19 of the latter 
were so rated in May. Of the 52 promoted chil- 
dren, 27 increased their standing, 12 moved 
downward, and 13 remained constant. Of the 
51 in the nonpromoted group, 17 improved, 26 
lost ground, and 8 retained their positions. The 
total picture of change is to the advantage of the 
promoted children. 

Correlating initial and final scores for pro-
moted children produced a coefficient of .881.
Substituting this figure, together with means and
standard deviations, in the regression formula
produced the equation Y2 = .433Y1 + 3.08 from
which nonpromoted scores were predicted. The
mean of these predicted scores was 5.22 and
exceeded final attained scores by .513 index
points. By means of the formula described in
Section III, the standard error of this difference
was computed to be .444. Dividing the mean
difference by its standard error produced a crit-
ical ratio of 1.16. The difference, then, is sig-
nificant at about the 25 percent level of confi-
dence. The hypothesis that there are no differ-
ences cannot safely be rejected on this evidence.


An Index of Outwardgoingness 


Computation of the index. — Earlier in this
section, it was stated that "the nonpromoted
children, over the promoted children, tended
to be selected relatively more often by class-
mates as being both wanted and not wanted for
very best friends." Was this dual trend a sig-
nificant one?




An answer to the above question was sought
through computation of an "index of outward-
goingness." The quoted statement implies that
the nonpromoted children "made contact" with
their peers, for good or bad, more frequently
than did the promoted group.
The index of outwardgoingness and the index
of adjustment were computed in a similar way,
with one essential difference. Whereas the in-
dex of adjustment decreased with increase in
the percentile rating for rejection, the index
of outwardgoingness increased with increase in
the percentile rating for rejection. Thus, a
percentile rating for rejection of 82, which re-
ceived an adjustment index score of 1 as de-


it may be seen that, accord- 
ing to the criterion used, both groups declined 
in outwardgoingness. The decline was greater, 
however, for promoted than for nonpromoted 
children. 
Statistical significance of the findings. — [...]
to determine the statistical significance of the
differences indicated. Initial and end scores for
the promoted group correlated .248. This is an
exceedingly low coefficient to be used for predic-
tive purposes; obviously, significance tests made



from the use of this coefficient in the equation
can be only a slight improvement over tests
made by much simpler methods such as the "t"-
test. Substituting this coefficient, together
with appropriate means and standard deviations
derived, in the prediction formula produced the
equation Y2 = .227Y1 + 3.26, from which final
scores for each nonpromoted child were pre-
dicted. Attained scores exceeded predicted
scores by an average of .651 index points. Divid-
ing this difference by its standard error of .267
produced a critical ratio of 2.438. That the
difference is a true one, then, can be accepted
at the 1.4 percent level of confidence. The null
hypothesis that there are no differences between
the groups on the factor of outwardgoingness,
as measured by the index, is rejected.
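
The contrast between the two indices can be made concrete with a brief sketch. It assumes, as the description of the "one essential difference" suggests, that the acceptance component is scored identically in both indices and that only the direction of the rejection component changes; the function names and boundary handling are hypothetical.

```python
def quarter(percentile):
    """1 for the bottom quarter up to 4 for the top quarter
    (boundary ranks go to the lower quarter, an assumption)."""
    if percentile <= 25:
        return 1
    if percentile <= 50:
        return 2
    if percentile <= 75:
        return 3
    return 4


def adjustment_index(acceptance_percentile, rejection_percentile):
    """Rejection counts against the child: high rejection lowers the index."""
    return quarter(acceptance_percentile) + (5 - quarter(rejection_percentile))


def outwardgoingness_index(acceptance_percentile, rejection_percentile):
    """Rejection counts as contact with peers: high rejection raises the index."""
    return quarter(acceptance_percentile) + quarter(rejection_percentile)


# A child often chosen and often rejected (rejection percentile of 82).
print(adjustment_index(90, 82))
print(outwardgoingness_index(90, 82))
```

A child who is both much chosen and much rejected therefore scores in the middle of the adjustment scale but at the top of the outwardgoingness scale.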


Mutual Choices and Rejections


The need to examine mutual bonds. — A single
bond of acceptance or rejection running from


plete picture of social adjustment. According


... the personal influence of a
member of the class must


bonds were maintained by boys, 2 by girls, and
the remaining 5 by boy-girl pairs. For rejec-
tion, 3 bonds involved boy pairs and 2 were con-
fined to girls. By the end of the year, the pro-
moted group had doubled the bonds of mutual


acceptance, 27 such bonds involving 24 promot- 




ed children. Six bonds of mutual rejection,
meanwhile, were shared by 5 promoted young-
sters. Sixteen of these mutual acceptance bonds
were maintained by boy pairs, 9 by girls, and
2 by mixed pairs.

For the nonpromoted group, the trend was
quite different. At the beginning of the year, 31
nonpromoted children shared 41 mutual accept-
ance bonds; 22 were boy pairs, 17 girls, and 2
were mixed. Twelve bonds of mutual rejection
were shared by 11 nonpromoted children; 5
of these involved only boys, 3 involved girls, and
4 both boys and girls. By the end of the year,
only 31 mutual acceptance bonds were main-
tained, made up of 16 pairs of boys, 12 of girls,
and 3 of mixed sexes. Eighteen nonpromoted
children were involved. The rejection picture,
however, showed little change, 13 bonds being
maintained by 12 nonpromoted children, 9 of boys,
1 of girls, and 3 of both sexes.

These results largely uphold the assumptions
put forward at the end of the preceding section.
More nonpromoted than promoted children (31
to [illegible]) were involved in bonds of mutual choice
at the beginning of the year, a finding that is
significant at better than the 0.1 percent level.
More promoted than nonpromoted children (24
to 18) established mutual acceptance bonds at
the end of the year; this difference is reliable
at about the 24 percent level of confidence. Ob-
viously, the trend is a significant one favoring
the promoted group. The picture of mutual re-
jection changed but little for both groups, each
establishing one more bond at the end of the year than
at the beginning and the nonpromoted group es-
tablishing more on both occasions.

One other difference in group patterns war-
rants attention. On both trials, the ratio of mu-
tual choices to children sharing those choices
was closer to 1 for the promoted than for the non-
promoted group; that is, fewer promoted than
nonpromoted children tended to be involved in
more than one mutual acceptance bond. Further-
more, the decline in mutual relationships for
the nonpromoted group was markedly greater
as measured by number of children involved
than by number of acceptance bonds established.
The ratio of bonds to children for the nonpro-
moted group was greater at the end of the year
than at the beginning.
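
The ratios referred to above can be checked from the counts that survive in the text. The sketch below uses only those counts; the promoted group's beginning-of-year figures are not legible in this copy and are therefore omitted rather than guessed. The variable and function names are hypothetical.

```python
# (mutual acceptance bonds, children sharing them), as stated in the text.
counts = {
    ("nonpromoted", "beginning"): (41, 31),
    ("nonpromoted", "end"): (31, 18),
    ("promoted", "end"): (27, 24),
}


def bonds_per_child(bonds, children):
    """Ratio of mutual acceptance bonds to children involved in them."""
    return bonds / children


for (group, occasion), (bonds, children) in counts.items():
    print(group, occasion, round(bonds_per_child(bonds, children), 2))
```

A ratio near 1 means few children hold more than one mutual bond; a rising ratio for the nonpromoted group means the remaining bonds were concentrated among fewer children.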

Qualitative analysis of mutual bonds. — The
problem arose as to whether or not one group,
more than the other, tended to be involved in
mutual bonds with children who were accepted
or rejected a disproportionate share of the time.
Therefore, percentage and percentile data for
acceptance and rejection for all children
involved in mutual bonds with the promoted and
nonpromoted children were tabulated.
The trends were very similar for both groups.
The children with whom the promoted and non-
promoted children maintained mutual bonds of
acceptance at the end of the year were accepted
more and rejected less than were those with
whom such bonds were maintained at the begin-
ning of the year. A similar situation prevailed
in regard to mutual rejection bonds when the
data were computed in percentages. For per-
centile data, however, the children at the other
end of the rejection bonds in May were both ac-
cepted and rejected less than were those in-
volved in October. Such differences as were
present between promoted and nonpromoted
groups in regard to the acceptance and rejec-
tion pictures of those children with whom they
established mutual bonds were not significant at
any of the conventionally accepted levels.

One other question that arose was whether 
or not nonpromoted children tended to seek out 
one another as mutual friends to a greater de- 
gree at the beginning than at the end of the year. 
Similar information was not sought for the pro- 
moted children, since there was no way of ident- 
ifying other children of like progress in their 
rooms. A check revealed that, of the 41 mutu- 
al acceptance bonds established by nonpromoted 
children at the beginning of the year, 18 were 
with other nonpromoted children. At the end of 
the year, however, only 8 of the 31 mutual ac- 
ceptance bonds involved pairs of nonpromoted 
children. To be accurate, the significance of 
this difference should be tested using the tech- 
nique for comparing correlated proportions, 
since the sample of cases involved is the same 
for both trials. However, the amount of cor- 
relation being unknown, the technique for uncor- 
related percentages was used and a confidence 
level of 9.8 percent obtained. Allowing for the
factor of correlation, therefore, probably would
raise the level of confidence to better than 9 per-
cent. Only 2 bonds of mutual rejections involved
nonpromoted pairs at the beginning of the year;
there was none at the end of the year.
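
The test for uncorrelated percentages mentioned above can be sketched as follows, using the counts given in the text (18 of 41 bonds and 8 of 31 bonds). The function name is hypothetical, and the use of separate (unpooled) variances and of the normal curve is an assumption; the result falls close to, though not exactly on, the level reported.

```python
import math


def uncorrelated_proportion_test(x1, n1, x2, n2):
    """Critical ratio and two-tailed probability for the difference
    between two independent proportions (unpooled standard error)."""
    p1 = x1 / n1
    p2 = x2 / n2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    cr = (p1 - p2) / se
    probability = math.erfc(abs(cr) / math.sqrt(2))
    return cr, probability


cr, probability = uncorrelated_proportion_test(18, 41, 8, 31)
print(round(cr, 2), round(100 * probability, 1))
```

Treating the two occasions as uncorrelated omits the subtracted correlation term, so the standard error used here is, if anything, too large for the same children measured twice.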


Summary 


The foregoing analysis of sociometric data 
revealed the following statistically significant 
trends from initial to final ratings: 


1. The promoted children were named less
frequently than the nonpromoted children as per-
sons not desired for very best friends (2.4 per-
cent when computed in percentile form and 7.4
percent when computed in percentages).

2. Greater social "outwardgoingness" for
the nonpromoted group (1.4 percent), outward-
goingness being computed from a percentile in-
dex combining both social acceptance and social
rejection.

3. More mutual acceptance bonds for non-
promoted children at the beginning of the year
(0.1 percent); more mutual acceptance bonds
for promoted children at the end of the year (23.8
percent).

4. A tendency for the nonpromoted children
to seek out one another as mutual friends more
often at the beginning than at the end of the year.
(This difference was significant at only the 9.8
percent level, using the technique for compar-
ing uncorrelated percentages, but should be ad-
justed upwards to take into account the correla-
tion factor.)


SECTION V 


ANALYSIS OF THE TEACHER-RATING DATA 


Introductory 


This section compares the promoted and 


Teacher-Rating Findings


Behavior. — Schedule A of the Haggerty-Olson-
Wickman Schedules consists of fi[fteen behavior]
problems. The problems are [weighted for]
frequency of occurrence and for [illegible]
the individual as judged by teachers. [The weight-]
ing is such that high and low scores in[dicate un-]
desirability and desirability of behavi[or, respec-]
tively.

The scores on Schedule A for the promoted
and nonpromoted groups are presented in Table
II (see page 305). Mean scores of 26.0 for promot-
ed children and 32.0 for nonpromoted children
are to the advantage of the former.

Social and emotional. —Schedule B is com- 
posed of thirty-five traits wei 
their relationship to the over 
of Schedule A. Only two of t 



groups on the total for Division III, Social, and
Division IV, Emotional, of Schedule B. The
lower mean score of 46.4 for the nonpromoted
group, as contrasted to that of 47.4 for the pro-
moted group, is slightly in favor of the former.
Table IV compares the two groups on these two
divisions taken separately.




The promoted group, with a mean of 24.7 on
Division III and a mean of 22.5 on Division IV,
compared with means of 23.7 and 22.9 for the
nonpromoted group, has a slight edge in regard
to emotional development but shows up less well
on the social side.


Statistical Significance of the Findings 


The technique. — Analysis of data collected
from utilization of the Haggerty-Olson-Wick-
man Behavior Rating Schedules presented a
slightly different problem from that presented
by the self- and peer-rating data. In this in-
stance, only end scores were obtained. Simple
comparison of differences between means was
not a satisfactory procedure since it did not take
into account any initial differences that may have
been present at the beginning of the year. The
closest estimate of these initial differences that
could be made was the performance of the two
groups on some other criterion of adjustment.
The higher the correlation between Haggerty-
Olson-Wickman scores and initial scores on a
selected criterion, the more satisfactory that
criterion would be as a yardstick for estimating
the significance of group differences in the for-
mer. Consequently, the performances of the
promoted children on Schedule A, Divisions III
and IV of Schedule B, and the total of Divisions
III and IV, were correlated with their perform-
ance on all other evaluative criteria used. In
all instances, promoted children's Haggerty-
Olson-Wickman scores correlated more highly
with their initial scores for sociometric rejec-
tion (computed in raw percentages) than with any
other single criterion. The four coefficients
were then used in the equation, described in the
two preceding sections, to determine the regres-
sion of promoted group Behavior Rating Scale
scores upon promoted group initial sociometric
rejection (raw percentage) scores. Haggerty-
[Olson-Wickman scores] for nonpromoted chil-
dren [were then predicted fro]m the regression equa-
tion [from the]ir initial sociometric re-
jection scores. Predicted and attained scores
were then compared and the significance of mean
[differences tested] as previously described.

[Schedule A scores. — Scores for nonpro]moted children were predicted
from the formula Y2 = 1.28Y1 + 17.11. Attained
scores exceeded predicted scores by the [illegible].
This excess was to the a[dvantage o]f [illegible].
The mean difference, divided by its standard
error of 4.35, produced a critical ratio of 1.01.
From the table, the probability of obtaining a
deviate greater than this value in either direc-
tion is .312. The null hypothesis that there are




TABLE III


THE HAGGERTY-OLSON-WICKMAN BEHAVIOR RATING SCHEDULES
TOTAL FOR SOCIAL AND EMOTIONAL DIVISIONS OF
SCHEDULE B


Range        Promoted Group    Nonpromoted Group
84.5-89.5          1                  0
79.5-84.5          0                  0
74.5-79.5          2                  0
69.5-74.5          1             [illegible]
64.5-69.5          2                  4
59.5-64.5          2                  2
54.5-59.5          1                  4
49.5-54.5          9                 11
44.5-49.5         12             [illegible]
39.5-44.5         12                 11
34.5-39.5          7                  9
29.5-34.5          4                  8
24.5-29.5          2                  8

Number            55                 55
Mean              47.4               46.4
S.D.              12.28              10.78
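
The mean and standard deviation of a grouped distribution such as Table III can be recomputed from class midpoints, as sketched below for the promoted column. The frequencies are those read from the table (the entry for the 34.5-39.5 class is taken to be 7, an assumption about a damaged character), and the function name is hypothetical. Only the promoted column is used because two entries of the nonpromoted column are illegible.

```python
import math

# Midpoints of the class intervals 84.5-89.5 down to 24.5-29.5.
midpoints = [87, 82, 77, 72, 67, 62, 57, 52, 47, 42, 37, 32, 27]
# Promoted-group frequencies as read from Table III.
promoted = [1, 0, 2, 1, 2, 2, 1, 9, 12, 12, 7, 4, 2]


def grouped_mean_sd(mids, freqs):
    """Number of cases, mean, and standard deviation (divisor N)
    for a frequency distribution given by class midpoints."""
    n = sum(freqs)
    mean = sum(m * f for m, f in zip(mids, freqs)) / n
    variance = sum(f * (m - mean) ** 2 for m, f in zip(mids, freqs)) / n
    return n, mean, math.sqrt(variance)


n, mean, sd = grouped_mean_sd(midpoints, promoted)
print(n, round(mean, 1), round(sd, 2))
```

Under these readings the recomputed values agree with the Number, Mean, and S.D. rows printed for the promoted group.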
TABLE IV


THE HAGGERTY-OLSON-WICKMAN BEHAVIOR RATING SCHEDULES
SOCIAL AND EMOTIONAL DIVISIONS OF SCHEDULE B


                    Social                   Emotional

Range        Promoted  Nonpromoted    Promoted  Nonpromoted
46.5-49.5        0          0             1          0
43.5-46.5        0          0             0          0
40.5-43.5        1          0             0          0
37.5-40.5        3          0             2          0
34.5-37.5        1          1             1          6
31.5-34.5        2          3             3          2
28.5-31.5        3          7             0          3
25.5-28.5       10          8             5          6
22.5-25.5       15         14            12          7
19.5-22.5       10         10            11         13
16.5-19.5        5          6            11          7
13.5-16.5        5          6             6          8
10.5-13.5        0          0             3          3

Number          55         55            55         55

Mean            24.7       23.7          22.5       22.9

S.D.             6.26       5.22          7.07       6.83




no differences cannot be rejected on this evi- 
uo A B scores. —Nonpromoted group 
scores on the total for Division III and IV of : 
Schedule B were predicted from the formula Y}= 
"4Y, + 42.29. Attained scores fell s hort of 
predicted Scores an average of .541 per child. 
The standard error of this difference was 2.078 
which divided into the mean difference . 260 
times. The probability of obtaining a deviate 
greater than this value is . 794, The null hypoth- 
esis that there are no differences is retained. 

Theoretical scores on Division III (Social)
for the nonpromoted group were predicted from
the formula Y' = .38Y + 22.05. Again, predicted
scores exceeded attained scores, this time by a
mean difference of 1.54. This difference divided
by 1.14, its standard error, produced a critical
ratio of 1.350. This figure represents a
difference significant at only the 17.6 percent
level, a level of significance not conventionally
accepted as reliable.
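
The arithmetic behind the two comparisons above can be retraced in a few lines. The following Python sketch is offered as an illustration under one assumption: that the reported probabilities are two-tailed areas of the normal curve, which the wording "in either direction" suggests but the text does not spell out. The function names are hypothetical; the mean differences and standard errors are the figures quoted above.

```python
import math


def critical_ratio(mean_difference, standard_error):
    """Mean difference divided by its standard error."""
    return mean_difference / standard_error


def two_tailed_probability(cr):
    """P(|Z| > |cr|) for a standard normal deviate Z (assumed model)."""
    return math.erfc(abs(cr) / math.sqrt(2))


# Total for Divisions III and IV: mean difference .541, standard error 2.078.
cr_total = critical_ratio(0.541, 2.078)
p_total = two_tailed_probability(cr_total)

# Division III (Social): mean difference 1.54, standard error 1.14.
cr_social = critical_ratio(1.54, 1.14)
p_social = two_tailed_probability(cr_social)

print(round(cr_total, 3), round(p_total, 3))
print(round(cr_social, 3), round(p_social, 3))
```

Under that assumption the results fall within rounding distance of the reported critical ratios (.260 and 1.350) and probabilities (.794 and 17.6 percent).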

Nonpromoted group theoretical scores for
Division IV (Emotional) were predicted from the
formula Y' = .43Y + 19.51. This time, attained

Scores .637 points 
The standard error 


jection of the 


Analysis of Items 


Purpose of the analysis.—The distribution
of groups on each item was analyzed in order
to:

1. Identify those single items that discriminated
between promoted and nonpromoted groups.


vealing significant differences
between the two groups. These were: disinterest
in school work (item 1), cheating (item 2),
unpopular with children (item 7), and bullying
(item 9). Items 1, with nonpromoted to 24
promoted children in the category of less frequent
occurrence, and 2, with 46 nonpromoted to 40
promoted children thus rated, suggested a greater
tendency towards disinterest in school work
and cheating on the part of the promoted group.
Items 7 and 9, with only 31 nonpromoted to 46
promoted and 49 nonpromoted to 54 promoted
children in the categories indicating less
frequent occurrence, suggested a greater tendency
on the part of nonpromoted children towards
unpopularity with children and bullying. It is of
interest to note that the two categories,
disinterest in school work and cheating, bear a
relationship to one another, as do bullying and
unpopularity with children.

To determine the amount of confidence to be
placed in the above differences, the technique
used to test the statistical significance of the
difference between two uncorrelated percentages
was applied. The number of pupils falling into
each category was replaced in each case by the
percentage so rated and the differences were
tested. The tendencies of the nonpromoted
children to bully and to be unpopular with
children proved most significant, testing at the
4.6 and 0.2 percent levels of statistical
confidence, respectively. The two next most
significant items, the tendencies of the promoted
group to cheat more and to be less interested in
their studies, tested at only the 16 and 17.6
percent levels of statistical significance. There
was no need, therefore, to examine other items,
all of which discriminated between the groups to
an even lesser degree than did "disinterested in
school work" and "cheating."
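
The test of the difference between two uncorrelated percentages can be sketched as follows. The article does not print its formula, so the unpooled standard error used here, sqrt(p1*q1/n1 + p2*q2/n2), is an assumption, as are the function name and the use of a two-tailed normal probability. The counts are those given for item 9 (bullying), and the group size of 55 is taken from the "Number" rows of the tables.

```python
import math


def percentage_difference_test(count_a, count_b, n_a, n_b):
    """Critical ratio and two-tailed probability for two independent percentages.

    Uses the unpooled standard error sqrt(p1*q1/n1 + p2*q2/n2). The source
    does not state which formula was used, so this is an assumption.
    """
    p_a, p_b = count_a / n_a, count_b / n_b
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    cr = (p_a - p_b) / se
    prob = math.erfc(abs(cr) / math.sqrt(2))
    return cr, prob


# Item 9 (bullying): 54 of 55 promoted and 49 of 55 nonpromoted children
# fell in the category of less frequent occurrence.
cr_bullying, p_bullying = percentage_difference_test(54, 49, 55, 55)
print(round(cr_bullying, 2), round(100 * p_bullying, 1))
```

With these inputs the probability lands near the 4.6 percent level reported for bullying. Small discrepancies are to be expected from hand computation and table rounding.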


As both a check on the foregoing and an
endeavor to discover other discriminating items,
certain rating categories once more were
arbitrarily combined. It seemed significant to
distinguish, for any given item, between the
number of children in each group rated "has never
occurred" and the number rated for any incidence
of occurrence, whether once or twice,
occasionally, or frequently. Therefore, those
children rated in the first category were
considered as one group, and those rated in the
other three categories as another. Six items
were isolated and the significance of group
differences was tested.
This time, the tendency of the promoted
group to cheat (item 2) was significant at the
1.0 percent level. The other items showed a
greater tendency for the nonpromoted children

cipline (item 5), unpopular

Only items 7 and 13 were significant
at better than the 5 percent level. The tendency
for the nonpromoted group to be more unpopular
with children stands out as the

difference revealed by the item analysis of
Schedule A. This difference has a 0.2 percent
probability when analyzed one way and 1.8
percent when analyzed another.
Analysis of Schedule B items.—The analysis
of group distributions of Schedule B items offered
an even more complex problem, since each
child was rated according to a weighted,
five-point scale. This weighting system was
considered too refined for the purposes at hand.
Close examination of the descriptive categories
under each weighting revealed an attempt on the
part of the scale-makers to depict a continuum of
behavior, ranging from one extreme through a
central tendency to another extreme. Examining
item 15, for example, "Is he quiet or talkative?",
revealed that it called for a rating response
ranging from "speaks very rarely" at one extreme
to "jabbers" at the other. It seemed reasonable,
therefore, to group "speaks very rarely" and
"usually quiet" into one category descriptive of
individuals who tend to be quiet, and to group
"jabbers" and "talks more than his share" into
another descriptive of persons who tend to be
talkative. Such procedure focused attention on
the descriptive phrases and eliminated completely
all weighting considerations.
The writer went through items 15 to 35 in this
way, arbitrarily reducing five categories to
three. The list of items was examined to identify
those items that appeared to differentiate
significantly between the groups. Eight trial
items were isolated in this fashion. More
promoted than nonpromoted children were rated as
being "inwardgoing," gracious, and yielding.
More nonpromoted than promoted children were
rated as being talkative, resigned and compliant,
stolid, easily discouraged and cautious. However,
in some instances these differences tended to be
offset by a reversal of the situation exhibited
at the other extreme. For example, although more
promoted than nonpromoted children were rated
"inwardgoing" in their social habits, more of the
former than of the latter also were rated as
being "outwardgoing." Reversals of this sort put
a reservation on statements such as "the
nonpromoted children tended to be less
inwardgoing socially, a difference acceptable at
the 1 percent level of statistical confidence,"
a reservation that was taken into consideration
in the final interpretation of findings.
The results enumerated above were converted
to percentages and the standard error of the
difference between uncorrelated percentages,
critical ratios, and probabilities were computed.
Only the tendency of promoted children to be
rated more inwardgoing in their social habits
(item 17) showed up as a difference that could be
accepted with a high level of confidence (0.6
percent, reduced somewhat by a tendency also for
promoted children to be rated more frequently
as outwardgoing). The tendency for nonpromoted
children to be more resigned in their acceptance
of authority (5.6 percent), to be more stolid in
temperament (7.6 percent) and to act more
cautiously (8.2 percent), were the other
differences that most closely approximated levels
of statistical confidence conventionally accepted
as reliable.


Summary 


No statistically significant differences
between the promoted and nonpromoted groups for
the Haggerty-Olson-Wickman Behavior Rating
Schedules were found when total scores for any
given schedule or subsection were considered.
However, certain individual items on both
Schedules A and B discriminated significantly
between the groups. For Schedule A, these were as
follows:

Item 2: Cheating. Thirty-seven nonpromoted
as against 24 promoted children were rated
"has never occurred" (1.0 percent).

Item 7: Unpopular with children. Forty-six
promoted as against 31 nonpromoted children
were rated "has never occurred" or "has
occurred once or twice" (0.2 percent). Thirty
promoted to 17 nonpromoted children were rated
"has never occurred" (1.8 percent).

Item 9: Bullying. Fifty-four promoted to 49
nonpromoted children were rated "has never
occurred" (4.6 percent).

Item 13: Stealing. Fifty-four promoted to
47 nonpromoted children were rated "has never
occurred" (1.2 percent).

For Schedule B, only item 17, "What are his
social habits?", revealed differences that are
acceptable at conventional significance levels.
More promoted than nonpromoted children (26
to 13) were rated "lives almost entirely to
himself" or "follows few social activities" (0.6
percent). However, this difference must be
interpreted in the light of the fact that 10
promoted to 6 nonpromoted children were rated
"actively seeks social pleasures" or "prefers
social activities to all else" on this item.


SECTION VI 


SYNTHESIS AND INTERPRETATION 
OF FINDINGS 


Purpose of this Section 


The present section is designed to bring to- 
gether differences between groups that have 
been identified, to combine related differences 
into appropriate patterns, and to interpret the 
meaning and significance of these patterns. 

First, an index of total adjustment was
worked out to show both the consistency of
individual scores and the total difference
between groups on all instruments. Two questions
were thus answered:

1. Did the same promoted or nonpromoted
children tend to score consistently low on all
instruments, or did they balance their low
scores on one instrument with high ones on
another?

2. Did the pooling of all total scores on each
instrument reveal significant differences
between the groups?

Then, a number of hypotheses were put forward
to explain the presence of significant
differences in the light of the promotion factors.


An Index of Total Adjustment 


The next step in the computation of the total
adjustment index was the computation of an
index score for each child

ometric accept-

The maximum and minimum possible scores for the
total initial testing were, therefore, 16 and 4
respectively. For the final testing, indices for
the same instruments, together with indices for
Divisions III and IV of the Haggerty-Olson-Wickman
Scales, were computed. Maximum and minimum
scores for the final testings, therefore, were
24 and 6. Total adjustment index scores ranged
from 4 to 16 for the initial trial and from 6 to
24 for the end trial.

*It will be recalled from Section IV that sociometric data for 52 promoted and 51 nonpromoted children were available.

Total index scores for each child having
been allotted in the manner described above,
promoted and nonpromoted children were then
assigned to separate distributions. The mean
initial index score of 9.58 for promoted children
was slightly less than the mean of 10.0 for
nonpromoted children but, on the final index,
the promoted group mean of 15.5 exceeded the
nonpromoted group mean of 14.8. This shift
from initial to end mean scores favored the
promoted group. Was the difference between
groups a significant one?

Statistical significance of total adjustment

ferences revealed between the groups could be
accounted for by chance alone, the

technique was used to determine the statistical
significance of these differences. Correlation of
initial and end total index scores for

children produced a product-moment correlation
of .574. Substituting this figure, together

moted child was predicted. Attained fell short
of predicted scores by the mean difference of
1.04. The standard error of this difference was
.980; the mean difference divided by its standard
error produced a critical ratio of 1.58. The
probability of attaining a difference greater than
this in both directions is .115. The difference,
favoring the promoted group, can be accepted
at only the 11.5 percent level of confidence.
The null hypothesis that there are no differ


Interpretation of Findings


Interpretation of major findings.—Major
differences were

woven. It will be recalled that
significant major group differences occurred
on sociometric data. These were as follows:




1. The promoted children were rejected
significantly less by classmates as persons not
desired for very best friends.

2. Nonpromoted children were more outwardgoing;
that is, they tended to be both accepted
and rejected as very best friends more than
promoted children when acceptance and rejection
were considered simultaneously as one score.

3. Promoted children increased, while
nonpromoted children decreased, their bonds of
mutual acceptance.

4. Nonpromoted children tended to reciprocate
one another's choices to a greater degree
at the beginning than at the end of the year. (This
shift was significant at only the 9 percent level,
uncorrected for correlation, but is included here
as an aid in understanding the general pattern.)


The whole picture of sociometric change
over the school year was one of decline in
desirable adjustment for the nonpromoted children
and of improvement for promoted children.

These repeating youngsters were oldtimers
compared to the incoming first-graders with whom
they associated. They "knew the ropes" and,
perhaps because of this fact, assumed positions
of prominence at the outset. A good many of the
newcomers sought out these older classmates
as friends; or, at least they expressed a desire
to associate with the repeaters. At the same
time, however, others among the incoming group
disliked many of the nonpromoted children;
perhaps they had experienced or were experiencing
bullying from the latter. Meanwhile, nonpromoted
children tended to seek out one another,
setting up little cliques within the larger group.
The fact that nonpromoted children concentrated
their best-friend choices among themselves
instead of distributing them more proportionately
throughout the room is a potent factor to be
considered in interpreting the large number of
choices received by this group at the beginning
of the year.

At the end of the year, the nonpromoted children
still maintained, on the whole, relatively
prominent positions in their classrooms, but
their prominence was founded on a different
basis. While they had not declined in unpopularity
as measured by rejection, they had declined in
popularity as measured by acceptance. Nothing
in the data suggested that these nonpromoted
children sought less desirable individuals for
mutual friends than they had at the beginning of
the year; at any rate, their mutual acceptance
bonds were maintained with relatively popular
children on both trials.

By the end of the year, the promoted children
had greatly strengthened their positions in
the social structure of the classroom. Although
they were less outwardgoing, as measured by
the index, this change had come about by a
reduction in rejection that exceeded an increase
in acceptance. Accompanying the increase in
social acceptance as measured by single bonds
of choice, was an increase in the number of
mutual bonds established. The whole comparative
picture of change was one of increasing
healthiness in social adjustment for the promoted
group.

Interpretation of items related to social ad- 
justment. —Differences in single items related 
to social adjustment were considered first, since 
all major differences identified were concen- 
trated in this area. 

The writer sought, through examination of 
the discriminating items, to discover, first, 
evidence to explain why the nonpromoted group 
declined in acceptance and why it maintained a
higher level of rejection than was maintained by 
the promoted group. The following evidence 
threw considerable light on the problem: 


1. The nonpromoted children appeared to be
lacking in certain social skills requisite to
amicable group relationships. They declined, while
the promoted group improved significantly, on
the self-rated items, "Do you tell mean children
what you think of them?" and "Do many of
the children start quarrels with you?" Of course
these questions are disguised to make it appear
that the other children are at fault. The answers,
however, reveal an excess of involvement in
friction-producing relationships with other
children, relationships tending to incur ill-will
and subsequent rejection as friends. Related here,
also, is the significant nonpromoted group
decline on the item, "Do people try to argue with
you a great deal?" There was, however, a
slight decline for promoted children as well on
this item.

Appropriate to this discussion of social skills
is the fact that nonpromoted children were rated
by their teachers significantly more than pro- 
moted children for bullying. The possibility 
that first-graders experienced bullying from 
these older repeaters was put forward earlier 
as a factor to be considered in explaining the 
rejection of nonpromoted children. There ap- 
pears to be ample evidence to support the pos- 
sibility that inadequacy in the area of social 
skills contributed to less satisfactory social ad- 
justment on the part of the nonpromoted group. 

2. Another hypothesis is that undesirable
social standards or attitudes contributed to the
declining social adjustment of the nonpromoted
group. Supporting evidence is found in the fact
that there was an increasing tendency for the
promoted group and a decreasing tendency for
the nonpromoted group to answer "no" to the
question, "Should one be nicer to bright children
than to others?" In addition, while there
was no change for the nonpromoted group on the
item, "... person always be nice to those
... in games?" the promoted group answered
"yes" significantly more often on the final than
on the initial trial. These differences suggest a
greater predisposition on the part of the
promoted children to behave socially in a more
approved manner. Inadequacy of the nonpromoted
group in the areas of social skills and social
standards most certainly would explain why these
children were rated more frequently by their
teachers as being unpopular with children.


3. There is evidence from test material pur-

social and self-adjustment to support the
exploratory hypothesis that

a. "Do many people you know say mean
things about you?"

b. "Do your classmates often say things that
hurt your feelings?"

c. "Do you think that

responses to the
question, "Do you think oth
like you?", listed under

ceptance. The teacher
supporting evidence:
promoted children
social activities.

Evidence here, all under the Self Adjustment
division of the California Test of Personality, is
conflicting. Differences on the following items
favored the nonpromoted group:

a. "Do the children ask you to play with them?"
b. "Do the other children like to have you
with them?"

Once again, these data indicate only how the
children felt about themselves. Whether or not




the situation suggested here was reciprocated
by the other children is not known. Perhaps the
responses to these questions are indicative
of "wishful thinking," an expressed desire to
be wanted by the other children made more
poignant by the realization that their classmates
said mean things about them, hurt their feelings,
and tried to avoid them. The possibility that
some wishful thinking may have been involved
here is strengthened by other differences related
to outwardgoingness that favored the promoted
group. There was a greater tendency for the
promoted group to answer "no" to the following
questions, answers that are indicative of greater
freedom from withdrawing tendencies:

a. "Do you like to stay away from many of
the children?"
b. "Would you rather think about nice things
than play?"

The bulk of the evidence supports the
hypothesis that the nonpromoted group engaged more
actively in social activities. However, other
evidence has shown that their social contacts
provoked friction and incurred dislike. Perhaps
it was feelings of personal dissatisfaction with
this undesirable and disapproved socialization
that stimulated the desires to "think about nice
things" and to "stay away from many of the
children." Much of the


hdraw from peer-gr al 
Deriences, their value to the indivi 
t 


Interpretation of items related to self-adjustment.—
It will be recalled that there were no
significant differences between the groups in
total scores on any instrument or on any
subsection of instr
persona] adjustme 
DO Suggestions or majo 
nifica 
It wa. 


m 
hea 
rated by their teachers for Cg 


» While the nonprome item 
" ow 

unhappy because of getting low marks
in school?" the promoted group declined
significantly on this question. These data suggest




that many of these promoted youngsters,
slow-learners during their first year of formal
schooling, were taxed beyond their learning
capacity by the work of the second grade.
Unfortunately, because of the gross overcrowding
of many elementary classrooms, the rigidity of
grading policy, curriculum, and instruction, and
the tendency for many teachers to think of
themselves as teachers of given grades rather than
of children, our schools often give little more
than lip-service to individualized instruction.
To the degree that these and like conditions
existed in the schools of the present study, these
schools probably were unable to adapt their
instructional programs to the needs of these
slow-learning children. It is not surprising, then,
that some youngsters resorted to cheating as
one means of conforming to expected levels of
achievement. More disturbing is the fact that
they were "unhappy because of getting low
marks in school."

2. Two other differences that seemed closely
related to the preceding discussion favored the
nonpromoted group. Both groups declined from
initial to end trials on the items, "Do your
parents criticize you a lot but seldom praise you?"
and "Do you feel that no one loves you at home?"
The decline for the nonpromoted group was only
slight but that for the promoted group was
significant. The greater unhappiness over low
marks and feelings of being unwanted at home
for the promoted children appear to be related.

Practically all parents are concerned over
the school success of their children; the
prestige of the entire family is involved. The
comparative progress of these second-graders
probably was not up to parental expectations;
reports to the home may not have been of a sort
to elicit praise; slow progress and
teacher-comments to the effect that "Henry will
have to do better if he is to make the grade,"
frequently go hand-in-hand. Parent and teacher
exhortations to do better when the capacity to do
better was not there seem well directed to the
end of creating feelings of both unhappiness at
school and rejection at home.
The question arose as to why these tendencies
were not so clearly developed in the nonpromoted
group. It may have been because, after both
parent and child had experienced a failure,
continued slow progress could be only
anti-climactic. Or, it may have been because the
greater ease in achieving on a second trial
militated against friction-producing situations
and so reduced tensions all around, for child,
teacher, and parent. Then, too, an actual grade
failure frequently precipitates teacher-parent
conferences. Such conferences often center around
the theme that "another year in the grade will be
better for Henry in the long run," a possibility
that a conscientious parent usually is willing to
accept. The parent even may readjust, secretly,
his ambitions and content himself in the future
with a much lower expectancy for Henry.

3. A number of statistically significant items 
supposedly related to self adjustment but not 
clearly enough related to the preceding analysis 
or to one another to be woven into a meaningful 
pattern are discussed below. 

a) The nonpromoted group declined significantly
and the promoted group declined only slightly on
the item, "Is it hard to talk things over with
your folks because they don't understand?" This
result appears to contradict the earlier findings
related to the home situation. Was the increased
tendency for nonpromoted children to answer
"yes" to this question the result of decreased
parental interest in these children because of
disappointment in them? Such an outcome seems to
be out of all proportion to the magnitude of the
suggested stimulus.

b) The promoted group improved significantly
and the nonpromoted group declined markedly on
the item, "Do you have many colds?", an item
purporting to measure freedom from nervous
symptoms. There was no additional evidence in
this area upon which to base interpretations.
However, there may be a connection here with the
social adjustment evidence listed earlier to the
effect that the nonpromoted children appeared to
sense the disapproval of their peers. Affirmative
responses to this particular item certainly could
be connected with a basic sense of insecurity.

c) The nonpromoted group gained significantly
and the promoted group gained only slightly on
the item, "Do the children think you are afraid
of things?" These findings make sense in the
light of previous findings and the actual wording
of the question. If the question had been worded
"Are you afraid of things?", it might have
constituted a rather obvious attempt to get at
nervous symptoms; the question in this area on
colds is much more subtle. It is to be expected
that the repeaters would react as they did to
this question. In the first place, they were
older and thus probably more daring than their
classmates. In the second place, realizing their
possibilities for superiority in physical prowess
as contrasted with their demonstrated inferiority
in mental activities, they could not afford to
give the other children cause to think them
afraid. This finding and the assumptions raised
about it suggest a relationship to the greater
tendency towards bullying reported for the
nonpromoted group.

d) The final significant difference between
the groups was that the nonpromoted children
were rated by their teachers more frequently for
stealing. None of those checked for this offense
was rated in a category of frequency higher than
"occasional occurrence." Furthermore, no
evidence was available concerning the nature of
the transgressions themselves. Criminologists
generally attribute a chronic disposition to
thievery to the presence of a combination of
several debilitating factors. Many case studies
suggest both careers of theft and single
instances of theft as means of seeking a revenge
against a discriminating or misunderstanding
society. A possible interpretation of the greater
incidence of stealing for nonpromoted children is
suggested in the light of this theory and the
less satisfactory social adjustment of the
nonpromoted group.


The preceding discussion has served to point


SECTION VII
CONCLUSIONS AND RECOMMENDATIONS 


Conclusions 


In the initial section, it was stated that the
following hierarchy of conclusions was attempted:

1. Retention or rejection of the null hypotheses
that there were no significant differences
between groups in social and personal adjustment.

2. Evaluation of differences, i

significant to permit rejection of the hypot
to determine the relati

ed and nonpromoted groups.

3. Analysis of any significant findings and the
experimental design of the study to determine the
contributions of the promotion factor in
determining existing differences.


Testing the hypotheses.—Twenty-nine
instances of significant difference, some
involving group differences on major sections of
an instrument and others involving only single
items of instruments, were identified. These
ranged in statistical significance from better
than the 1 percent to about the 5 percent level
of confidence. These data clearly reject the
hypotheses and so they are re-worded below in the
phrasing appropriate to the findings:


1. There were differences in social adjustment
between repeating and non-repeating school
children.

2. There were differences in personal adjustment
between repeating and non-repeating school
children.


Evaluating the differences.—The fact that
the nonpromoted children were more outwardgoing,
according to sociometric data, probably
would have been to the advantage of this group
were it not for the fact that rejection
contributed more than acceptance in producing this
situation. Then, declining mutual acceptance
bonds among members of the nonpromoted
group could have been a desirable development
had these children established other such bonds
with classmates other than repeaters. This
development appeared to be just part of a
generally declining picture of mutual acceptance
for nonpromoted children.

Major trends of difference supported the
general conclusion that differences in the area of

relationships favored the promoted group. This
conclusion was substantiated by findings on
all three types of data collected. Another major
trend suggested the

that the promoted children were more

school progress and

concerns that appeared closely related to each
other. It should be clearly pointed out here that
patterns of difference discriminate consistently
neither between groups nor among individuals. The
greater number of differences favoring the
promoted group was counterbalanced, in part, by
certain significant differences favoring the
nonpromoted group; exceedingly high or low scores
on given items

children were compensated for by low or high
scores on other items.

Accounting for differences.—It usually is a
far

difficult problem

of differences than to account for their

sent study, a great many va
influence social and
most potent

groups on these bases. In addition, the schools
chosen were neither predominantly rural nor
predominantly urban.

ers of the children selected represe

Furthermore,




of the schools from which promoted children
were selected corresponded closely to the
first-grade enrollments of the schools from which
nonpromoted children were selected. Other factors,
such as the educational philosophies and
facilities of the schools, teaching procedures,
and so on, were to operate by chance on the
assumptions that the schools in this county system
do not differ widely and that any differences,
particularly in quality of teaching, would be
distributed at random fairly evenly between
promoted and nonpromoted groups. The
experimentally differentiated factor was that of
promotion; the promoted group was in its second
year and second grade of schooling while the
nonpromoted group was in its second year and
first grade of schooling.

To the extent that the influential factors left
to operate by chance actually did contribute to
both groups evenly, as assumed, and to the
extent that the learning environments of first and
second grades are generally and fundamentally
alike, the differences identified can be
attributed to the differentiating effects of
promotion or nonpromotion. Obviously, then, the
roles of promotion and nonpromotion can never be
accounted for with absolute finality. Research
procedures for reducing the abnormal operation
of chance factors are described in the next
section.
One set of differences, more than all the others, suggests grade
placement as the major causal factor. This is the group of
differences revealed through sociometric "best-friend" techniques.
Promoted and nonpromoted children were compared first to their own
classmates; the relative positions of promoted children then were
compared to the relative positions of nonpromoted children. This
dual comparison revealed promoted children to be rejected
relatively less than nonpromoted children. Whether or not
first-grade educational environments differ fundamentally from
second-grade environments matters little, if at all; the important
consideration is that nonpromoted children fared less well than
promoted children when each group was compared to its own class
group.
The total body of evidence suggests the closer affiliation of
undesirable social and personal adjustment characteristics with
nonpromotion than with promotion. Although the exact causal nature
of this affiliation cannot be ascertained with finality, there are
clear indications that nonpromotion is the less defensible
educational practice.


Conclusions to the effect that some children experienced
unsatisfactory social contacts, that some were worried over their
school progress, that others were insecure in their home
relationships, that certain children were experiencing difficulty
in balancing success and failure, and


GOODLAD 


325 


so forth, have serious implications for the de- 
velopment of the individual personality. The 
fact that such conditions were found to a great- 
er degree in one group than in the other has, of 
course, important educational implications. 
However, they existed to a degree (and the data
suggest to a greater degree, on the whole, than 
normally would be expected) among the children 
of both groups. Case studies of disturbed indi- 
viduals reported in the psychological literature 
abound with references to such inadequate per- 
sonality adjustments; therapy for maladjustment 
among mature persons is a long, frequently un- 
successful, process. A more hopeful solution 
would seem to be attention to sound principles 
and practices of mental health while the individ- 
ual still is in his early, formative years. If un- 
wholesome personal development can be detect- 
ed as early as the first school years, it can be 
treated and prevented also during these years. 
The job is not that of the school alone; it re- 
quires the cooperation of the home and all ag- 
encies devoted to the welfare of boys and girls 
if any significant reduction in the incidence of 
mental ill-health is to be accomplished. 


Recommendations 


For promotion and related administrative 
practices. — The findings of this study suggest 
the following recommendations related to the 
classification of pupils in graded schools: 


1. That each instance of proposed grade retention be critically
examined. When an affirmative answer based on fact rather than
opinion cannot conscientiously be given to the question, "Is
nonpromotion more likely to facilitate the all-round development of
this child?", then that child should be promoted to the next grade.

2. That teachers adopt a broad factual basis for the consideration
of promotion and nonpromotion. Facts related to achievement and
intelligence are not sufficient; nor is the division into more
categories of a limited body of information adequate. Desired are
facts related to all phases of human growth, collected from a wide
range of sources throughout the year rather than during the last
few weeks of school, and analyzed in the light of sound principles
of child growth and development.

3. That schools or entire school systems examine their promotional
practices critically to make sure that they are resulting in the
best possible development of boys and girls. Such an evaluation
should include a re-examination and perhaps reformulation of
purposes, policies, and practices to make sure that they are
philosophically and psychologically sound, consistent, and
thoroughly understood by the entire personnel.


JOURNAL OF EXPERIMENTAL EDUCATION
326 


For curriculum and instruction. —It will be recalled that not all
differences favored the same group and that practically no
individuals were consistently well or poorly adjusted according to
the evaluative techniques used. Because of these patterns of
variability, the following recommendations are proposed:


1. That desirable concepts, habits, skills, and attitudes be
substituted for minimum essentials of subject matter in courses of
studies. Subject matter thus would


course of studies.

2. That instructional procedures be as varied
as the needs of the child:
The careful exercis


tions of how fast a
should progress or
normal rate of pro
than his own,


; no more is
possible, no less is enough.

For school organization. —Assigning grade
labels and promoting or faili


grades one to three, in more than forty Milwau- 
kee schools. In these 


of the sixth semester but take seven,
eight, or even nine semesters to travel from the
first contact with school to the fourth grade. Of


It provides for the practical application
of present-day educational philosophy. It


(Vol. XXII 


makes adjustment of teaching and administrative
procedures easier to meet
the individual differences of children.
It allows the bright child to go beyond
the usual standard for his grade
without removing him too far from his
social group. It encourages the slow
child to the limit of his ability. It
relieves strains and fears and
promotes wholesome mental health (13).


No evidence was given to support the above claims. However, since
they are in harmony with the basic philosophy and major emphases of
forward-looking school systems throughout the country, the plan
deserves careful attention. It is recommended, therefore, that
similar plans of school organization be implemented on an
experimental basis, with accompanying research into the growth and
development of children involved. It is further recommended that
curriculum and instruction be adapted to conform with the new plan
of organization and its underlying philosophy. The following would
be requisite here:


1. Courses of studies of the type recommended earlier in this
section.

2. Understanding by teachers of the developmental tasks faced by
children.

3. Understanding by teachers of teaching procedures appropriate to
assisting these children in the satisfactory accomplishment of
these tasks.

4. A wide range of instructional materials, suited to the growth
and developmental needs of all children represented.

5. A reporting system designed to reveal individual pupil growth
and development.


For further research. —Conduction of the
present study has revealed the need for further


1. Studies into the constancy of primary-grade
friendshi


Studies already have
, such as a home-par
y change the popularity of given children,
ut long-term fluctua
cing them, the effects of
room social structure;


ably in guiding the
school children.


2. Studies into the achievement, attitudes, behavior, and
personal-social adjustment of children embraced by such plans of
continual progress as the Milwaukee Plan. Too often, progressive
plans of organization and management have become stagnant or
discredited because of a failure to collect evidence that might
have disclosed existing weaknesses and provided for revisions to
conform with society's ongoing changes.

3. A more comprehensive investigation into current effects of
promotion and nonpromotion based upon a slightly different
experimental design than was used in the present study. It is
suggested that groups be matched, case-for-case, previous to rather
than after the incidence of promotion or nonpromotion. The children
matched should be drawn entirely from a group of potential
repeaters; one member of each pair then should be selected for
promotion to the next grade and one for retention in the present
grade. Previous to such promotion and retention, however, the first
administration of all evaluation instruments should take place.
These instruments should be designed to measure achievement,
intelligence, attitudes, behavior, and personal-social adjustment.
Thus, an entire year would have elapsed from initial to end
administration of the instruments. Developing anecdotal records,
maintaining careful observations, and holding interviews during the
intervening period would serve not only to supply corroborating
evidence but also to provide insight into the dynamics back of any
group differences ultimately identified. Such a study would
facilitate more clear-cut interpretation of differences at the
causal level, even though final attribution of differences to the
promotion factor still would not be irrevocably justified.
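
The matched-pairs design proposed in item 3 can be sketched in a few lines. This is only an illustration under stated assumptions: the text does not say how pairs are to be formed or how the member to be promoted is to be chosen, so the sketch assumes matching on a single hypothetical composite score and a random choice within each pair. The function name, the data, and the seed are all invented for the example.

```python
import random


def assign_matched_pairs(children, seed=0):
    """Pair children with adjacent matching scores, then split each pair.

    `children` is a list of (name, matching_score) tuples. Matching on one
    composite score and choosing at random within a pair are assumptions
    made for this sketch; the study text prescribes neither.
    """
    rng = random.Random(seed)
    ordered = sorted(children, key=lambda child: child[1])
    pairs = []
    # Walk the sorted list two at a time; an odd child out stays unpaired.
    for i in range(0, len(ordered) - 1, 2):
        first, second = ordered[i], ordered[i + 1]
        if rng.random() < 0.5:
            first, second = second, first
        pairs.append({"promoted": first[0], "retained": second[0]})
    return pairs


# Hypothetical potential repeaters with a composite matching score.
potential_repeaters = [("A", 10), ("B", 30), ("C", 12), ("D", 29)]
pairs = assign_matched_pairs(potential_repeaters, seed=1)
```

The evaluation instruments would be administered before this assignment and again a year later, so that differences within pairs can be compared.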

This study suggests that repeating a grade is detrimental to the
social and personal development of boys and girls. The evidence
presented, together with evidence from other studies that
repetition is not conducive to greater efforts or achievement and
that it is associated with undesirable school attitudes and
behavior, seriously questions nonpromotion as a valid educational
practice. This study further revealed, however, that neither all
the selected promoted children nor all the selected nonpromoted
children were consistently well or poorly adjusted; there was
considerable overlapping between groups and among the individuals
of any one group. Such findings suggest that, whether slow-progress
children be regularly or irregularly promoted, adequate subsequent
provision for their needs is not being provided in our schools. It
is recommended, therefore, that schools adopt, experimentally,
policies of regular progress and appropriate instructional
techniques that take the individual child where he is and guide him
forward according to his own potential capabilities. It is hoped
that, as a result of the success of these procedures, promotion and
nonpromotion would have no place in our educational vocabulary,
just as they now merit no rightful place in forward-looking
educational thought and practice.


BIBLIOGRAPHY 


1. Akridge, Garth H. Pupil Progress Policies and Practices (New
   York: Bureau of Publications, Teachers College, Columbia
   University, 1937), pp. viii + 76.
2. Anfinson, R. D. "School Progress and Pupil Adjustment,"
   Elementary School Journal, XLI (March 1941), 507-514.
3. Arthur, Grace. "A Study of the Achievement of Sixty Grade I
   Repeaters as Compared with That of Non-Repeaters of the Same
   Mental Age," Journal of Experimental Education, V (December
   1936), 203-205.
4. Bassett, Clara. "School Success, An Element in Mental Health,"
   Journal of National Education Association, XX (January 1931),
   15-16.
5. Borgeson, F. C. "Causes of Failure and Poor School Work Given by
   Pupils," Educational Administration and Supervision, XVI
   (October 1930), 542-548.
6. Burr, Marvin Y. A Study of Homogeneous Grouping (New York:
   Bureau of Publications, Teachers College, Columbia University,
   1931), pp. ix + 69.
7. Caswell, Hollis L. Education in the Elementary School (New York:
   American Book Co., 1942), pp. xiv + 321.
8. Caswell, Hollis L. Non-Promotion in Elementary Schools
   (Nashville: Division of Surveys and Field Studies, George
   Peabody College for Teachers, 1933), pp. x + 100.
9. Cheyney, W. Walker and Boyer, Philip A. A study reported in
   mimeographed form. Extracts quoted in Elementary School Journal,
   XXXIII (May 1933), 641-651.
10. Cook, Walter W. Grouping and Promotion in the Elementary Schools
    (Minneapolis: University of Minnesota Press, 1941).
11. Cook, Walter W. "Some Effects of the Maintenance of High
    Standards of Promotion," Elementary School Journal, XLI
    (February 1941), 430-437.
12. Elsbree, Willard S. Pupil Progress in the Elementary School (New
    York: Bureau of Publications, Teachers College, Columbia
    University, 1943), pp. viii + 86.
13. Faith, Emil F. "Continuous Progress at the Primary Level," Phi
    Delta Kappan, XXX (May 1949), 356-359.
14. Farley, Eugene S. "The Influence of Grading and Promotion
    Policies Upon Pupil Development," National Elementary Principal,
    XVI (July 1937), 268-274.


15. Farley, Eugene S. "Regarding Repeaters: Sad Effects of Failure
    Upon the Child," Nation's Schools, XVIII (October 1936), 37-39.
16. Farley, Eugene S., and others. "Factors Related to the Grade
    Progress of Pupils," Elementary School Journal, XXXIV (November
    1933).
17. Flotow, Ernest A. "Charting Social Relationships of School
    Children," Elementary School Journal, XLVI (May 1946), 498-504.
18. [Illegible] "[illegible] All Pupils and the [illegible]," School
    and Society, LIX (May 1944).
19. Garrett, Henry E. Statistics in Psychology and Education (New
    York: Longmans, Green and Co., 1947), pp. xii + 497.
20. Gilchrist, Edward P. "The Extent to Which Praise and Reproof
    Affect a Pupil's Work," School and Society, IV (December 1916),
    872-874.
21. Hurlock, Elizabeth B. "An Evaluation of Certain Incentives Used
    in School Work," Journal of Educational Psychology, XVI (March
    1925), 145-159.
22. Jablow, Lillian. "Deferred Promotions in Grade One," Baltimore
    Bulletin of Education, XXV (December 1947), 147.
23. Keliher, Alice V. A Critical Study of Homogeneous Grouping in
    Elementary Schools (New York: Alice V. Keliher, 1930).
24. Keyes, Charles H. Progress Through the Grades of City Schools
    (New York: Bureau of Publications, Teachers College, Columbia
    University, 1911), pp. 79.
25. Klene, Vivian, and Branson, Ernest P. "Trial Promotion Versus
    Failure," Educational Research Bulletin, VII (January 1929),
    3-11.
26. Lindquist, E. F. A First Course in Statistics (Boston: Houghton
    Mifflin Co., 1942), pp. xi + 242.
27. Lindsey, J. Armour. Annual and Semi-Annual Promotion with
    Special Reference to the Elementary School (New York: Bureau of
    Publications, Teachers College, Columbia University, 1933),
    pp. vii + 170.
28. McElwee, E. W. "A Comparison of Personality Traits of 300
    Accelerated, Normal, and Retarded Children," Journal of
    Educational Research, XXVI (September 1932), 31-34.
29. McGrath, G. D. "Pupil Failure—Our Greatest Challenge and
    Opportunity," Peabody Journal of Education, XXVI (March 1949),
    290-294.
30. Myers, V. C. "The Child Who Fails," Education, LVI ([date
    illegible]), 306-309.
31. Otto, Henry J., and Melby, Ernest O. "An Attempt to Evaluate the
    Threat of Failure as a Factor in Achievement," Elementary School
    Journal, XXXV (April 1935), 588-596.
32. Otto, Henry J. Promotion Policies and Practices in Elementary
    Schools (Minneapolis: Educational Test Bureau, Inc., 1935),
    pp. xii + 172.
33. Perlman, M. B. "[illegible] School Grades for Better Child
    Development," Understanding the Child, XIV (April 1945), 40-42.
34. Peters, Charles C., and Van Voorhis, Walter R. Statistical
    Procedures and Their Mathematical Bases (New York: McGraw-Hill
    Book Co., Inc., 1940), pp. xiii + 516.
35. Robinson, B. B. "Failure Is Too Costly for the School Child,"
    Parents' Magazine, XI (January 1936), 22-23, 55-57.
36. Sandin, Adolph. Social and Emotional Adjustments of Regularly
    Promoted and Non-Promoted Pupils (New York: Bureau of
    Publications, Teachers College, Columbia University, 1944),
    pp. ix + 142.
37. Saunders, Carleton M. Promotion or Failure for the Elementary
    School Pupil? (New York: Bureau of Publications, Teachers
    College, Columbia University, 1941), pp. viii + 77.
38. Stroud, J. B. "How Many Pupils are Failed?" Elementary School
    Journal, XLVI (February [year illegible]).
39. Templin, R. S. "A Check-Up of Non-Promotions," Journal of
    Education, C[illegible] (November 1940), 259-260.
40. Terry, [illegible]. The School Adjustment of Retarded Children.
    Unpublished Master's Thesis, Department of Psychology,
    University of Chicago, 1938.
41. Viele, John A. "Does the No-Failure [illegible] Work?" The Grade
    Teacher (June [year illegible]).
42. Wickman, E. K. Children's Behavior and Teachers' Attitudes (New
    York: Commonwealth Fund, 1928), pp. ix + 247.


A SUBSTRATA ANALYSIS OF SPELLING
ABILITY FOR ELEMENTS OF AUDITORY
IMAGES*


JACK A. HOLMES** 
University of California 
Berkeley, California 


SECTION I 
THE PROBLEM AND ITS BACKGROUND 


A. Introduction


1. The Problem and Hypothesis 


THE PURPOSE of this study is to investigate the auditory aspect of
the general hypothesis: Spelling ability at the high school and
university levels is an integrated composite of a multiplicity of
sub-abilities which may be identified in a general way as falling
into the visual, auditory, and kinaesthetic areas. Individual
differences in spelling ability may be accounted for in part by
individual differences in the ability to handle phonetic
associations; and since both abilities presumably depend in part
upon the ability to discriminate the various elements in complex
sound patterns, the following specific hypothesis is tenable: A
cluster of musical elements underlies individual differences in
spelling ability and the ability to make phonetic association. The
specific purposes of this experiment, therefore, are:


1. To discover the cluster of musical elements
which are related to individual differ-
ences in spelling ability;

2. To discover the relationships between 
Spelling ability, phonetic association, in- 
telligence, and the elements of auditory 
images; and 


3. To calculate the contributions to variance 
which each of the substrata musical elem- 
ents makes to spelling ability. 


2. The Literature and Rationale of the Study 


The visual image of a picture can be described in terms of line,
form, color, texture, etc., and similarly, an auditory image of a
word is presumably composed of certain auditory and/or musical
elements which are more fundamental than the gestalt itself. The
fundamental significance of this study lies in its effort to pin
down the specific elements which go to make up that elusive and
generalized "auditory imagery" with which the literature on
spelling abounds. Dolch (10), for instance, says, "ear-spelling
will always be widely used." Hartman (18, 19) concludes that
spelling ability is no more a function of general visual perception
than it is of general auditory perception; however, he goes on to
indicate that it is largely dependent upon one form of visual
reaction, accuracy of immediate word-form perception, and dismisses
the implications of the auditory elements without further ado.
Likewise, Humphrey (26), Nolde (33), and Tidyman (47) indicate that
spelling ability develops around three basic types of imagery:
auditory, visual, and kinaesthetic; but become no more specific
than to name the auditory factor as a general term.

McGovney (29) found that the most significant difference between
good and poor spellers was the ability to give sounds for letters
and syllables, and to perceive small differences in words. Others
(8, 9, 25) find that "training in auditory-visual discrimination"
helps improve spelling ability, but the terms used are so general
that a teacher wishing to build content into a course which
contained the fundamental elements as well as the generalized
"training in hearing" would be at a loss to know where to begin or
what to place in his course.

*Financial assistance for this study was provided by the School of
Education, University of California, Faculty Research Fund.

**The author wishes to express appreciation to Doris Caldwell,
Martin Kling, Donald Green, Pat Kutzner, and Edith Wilcox, all of
whom, as research assistants, went beyond the line of duty to
ensure accuracy and rigor inherent in the experimental design. To
Georgia Cooper, Director of the Georgia Cooper Schools, Walnut
Creek, Lecturer in Special Education in the Summer Sessions,
University of California, thanks are due for valuable professional
advice regarding the Phonetic Association Tests. Thanks are also
extended to Professor Paul Morton, Director of the University's
Computer Laboratory, for placing IBM equipment at the writer's
disposal. To Mr. Robert E. Brownlee, Principal of the University of
California's Demonstration Secondary School, Oakland, California,
and his staff goes my sincere "thank you" for their help and
cooperation in administering the paper and pencil tests to the high
school students. The author wishes also to acknowledge the
cooperative attitude of the students in his classes who, as
undergraduates, helped by taking the tests. Finally, thanks are due
to Professor Harold D. Carter, friend and colleague, for his
criticism, suggestions, and encouragement.

In apparent opposition to the above, Gates et al. (14) found that
the deaf were almost four years superior to normals in spelling
when reading was held constant. However, the technique of holding
reading constant, it must be remembered, may have resulted in an
over-correction. Kiefer et al. (27) found that while many poor
spellers had defective vision, none had defective hearing. Kiefer's
data were not substantiated by Russell's (38), who found no
reliable difference between good and poor spellers on the Betts'
Telebinocular. Russell (38), Spache (43, 44), and Gates and Russell
(15) have stressed memory for sounds as one of the "inherent"
difficulties in bringing about spelling readiness.
decided that nearly all he 


monotones, She Says tha. 


average or good Spellers, but her 
Her results 


pecific contri- 


ng ability:

    Rote Memory for Words and Letters      1.97%
    Cognition                             39.20
    Visual Perception for Words           10.07
    Not accounted for                     48.76
                                         -------
                                         100.00%




Russell (38) did find a functional auditory factor in that on
auditory discrimination of words of similar sounds the normals did
significantly better than the poor spellers. Schonell (40), Spache
(42), Palmer (34), Horn (24), and Watson (48) may be said to be in
basic agreement with the finding that good spellers excel the poor
spellers in phonetics and ability to make auditory discriminations,
but again these authorities seldom become more specific in respect
to the fundamental auditory elements involved than to use such
phrases as the "most potent cause of disability in spelling" is an
"intense functional paraphemia," or, an inability to give the
"rhythmic pattern of words."

In summarizing the literature dealing with the relationship of the
elements of music to phonetic- and spelling-ability, one might say
that Russell's (38) study is the most representative in that within
his separate findings one may find the anomaly. On the one hand he
finds no difference on an audiometer test for good and poor
spellers, and on the other hand he found a "functional auditory
factor" in that on auditory discrimination the normals did better
than poor spellers. The general consensus of opinion, based on
experimental, clinical, and classroom experiences, appears to be
that auditory discrimination and phonic skills are very important
to spelling ability, but to the writer's knowledge no investigator
has attempted a systematic experiment to determine the fundamental
elements of the auditory images which must underlie individual
differences in the ability to form auditory images, preparatory to
the spelling of words. This study attempts such an investigation.


3. Importance of the Problem 


If it can be shown that spelling ability is significantly related
to phonetic ability and that both spelling ability and phonetic
ability depend to some extent upon certain fundamental elements
generally considered as particularly associated with musical
ability, then several implications become apparent. From a
theoretical point of view, it should give the ed[illegible] spell.
Secondly, the elements identified could be further [illegible]
whether or not they were heredi[illegible] talents or whether they
could be trained. [Illegible] since [illegible], Russell (39), and
Holmes [illegible] all found that spelling and reading are related,
the schools may be overlooking the very students who are most in
need of music training as a concomitant learning basic to such
academic skills as spelling and reading. In short, only when the
schools know what musical elements are important in the formations
of auditory-word images can they hope to understand the dynamics of
learning such images. Meaningful remedial treatment of spelling
must depend upon meaningful diagnosis of spelling weaknesses, and
meaningful diagnosis must in its turn depend upon knowing and
assessing the visual, auditory, and kinaesthetic elements which
form the basis of individual differences in spelling ability.


SECTION II 
SOURCES OF DATA 


A. The Populations Sampled 


1. The High School Group


A random sample of 227 University of California Demonstration
Secondary School students, between the ages of fifteen and eighteen
years inclusive, constitute the High School Group. This sample was
drawn from those students taking the English and Social Science
classes in the Summer of 1951.


2. The First University Group


A random sample of 91 University of California students enrolled in
the writer's Educational Psychology class, Spring 1951, make up the
First University Group.


3. The Second University Group


A random sample of 102 University of California students enrolled
in the writer's Educational Psychology class, Fall 1951, make up
the Second University Group.

The students in both university groups are, for the most part,
juniors and seniors and are homogeneous to the extent that their
major interest is in the field of education.


B. The Test Batteries 


1. The Battery Used with the High School Group


The criterion, spelling ability, was derived from scores on a
5-choice multiple-choice spelling test. The 35-item test was made
by using tables of random numbers in order to distribute the
correct choices through the list. The words used were taken from
Groves' (17) study and therefore were of known difficulty. The
words were placed in order of increasing difficulty. The particular
merit of the wrong choices was that they were the high frequency
misspellings reported in the above mentioned study.

The independent variables were assessed on a phonetic association
test and on a modification of the Kwalwasser-Dykema Music Test 1
(23). The High School Phonetic Association Test, Form A, is
composed of 100 items selected in such a way as to sample most of
the phonetic elements presented in a standard elementary course in
phonetics. In this test the student is given a minimum number of
cue-letters for each word. The key phonetic element being tested in
each item is either omitted or substituted for, so that in order to
recognize the word the student must sound the strangely spelled
word out to himself and can identify it only by its sound. For
instance, "mlk," "138," and "sertn" are some of the more obvious
examples. Errors in spelling were not deducted for in this test.

Table I lists the tests used in this battery along with their
reliabilities.


2. The Battery Used with the First University Group


Two criteria for spelling ability were used 
in the First University Group. In the first cri- 
terion 79 words selected from various spelling 
scales, such as those given by Ayer (3), and 
Buckingham and Dolch (4), and from such stand- 
ard free association lists as those given by Ros- 
anoff (37) and Rapaport (36) were used. The list 
was administered orally and the students were 
instructed to write the words in one of three col- 
umns depending upon how they felt toward the 
word. The test was called a Word Pleasantness 
Test and the emphasis was placed on putting the 
word in the (P) Pleasant, (I) Indifferent, or (U) 
Unpleasant column rather than on trying to spell
the word correctly. This technique was calcu-
lated to reduce cheating, carry over, and also
to sample the student's spelling ability under
conditions which more normally approximated
the task of spelling a word which comes to mind
while writing. This test was given in the first
month of the semester.

The second criterion for spelling ability used with this group was
a 5-choice multiple-choice spelling test of 50 items. The words
were selected from missed items in the above Word Pleasantness
Test. The wrong choices were made from high frequency misses as
tallied from the responses given by the university students earlier
in the term. The items increased in difficulty and the correct
choices were distributed from item to item as dictated by the
tables of random numbers. This test was given on the last day of
the semester.

The independent variables consisted of the "Q" and "L" scores of
the American Council on Education Test, 1949, and the elements of
auditory images were assessed on the Seashore Measure of Musical
Talents Test (41). Table II presents the tests and their
reliabilities.

1. The modifications of the K-D Test made by Holmes (23) were of
such a nature as to raise the reliabilities to where the subtests
were useful as research and diagnostic instruments.


TABLE I

THE TESTS AND THEIR RELIABILITIES*
(The High School Group)

                                                         r      N
Criterion
 1. High School Spelling Test, Form A, M:C,
    by J. A. Holmes (Mimeographed)                      .90    267
Independent Variables
 2. High School Phonetic Association Test, Form A,
    by J. A. Holmes (Mimeographed)                      .98    275
 3. Kwalwasser-Dykema Music Test as Modified by
    Holmes**
    a. Tonal Memory                                     .18    237
    b. Quality Discrimination                           .70    237
    c. Intensity Discrimination                         .79    237
    d. Tonal Movement                                   .88    237
    e. Time Discrimination                              .50    237
    f. Rhythm Discrimination                            .71    237
    g. Pitch Discrimination                             .72    231
    h. Melodic Taste                                    .43    237

*Determined by the odd-even split-half technique and corrected by
the Spearman-Brown Prophecy Formula.
**Cf. Holmes (23).
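
The footnote to Table I states that the reliabilities were obtained by the odd-even split-half technique and corrected by the Spearman-Brown Prophecy Formula. A minimal sketch of that computation follows, assuming each examinee's responses are stored as a row of item scores (1 for right, 0 for wrong). The function names and the small data set are hypothetical and are not taken from the study.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_brown(r_half):
    """Step a half-test correlation up to full-test length."""
    return 2 * r_half / (1 + r_half)


def split_half_reliability(item_scores):
    """Odd-even split-half reliability with the Spearman-Brown correction."""
    # Items 1, 3, 5, ... form one half; items 2, 4, 6, ... form the other.
    odd_totals = [sum(row[0::2]) for row in item_scores]
    even_totals = [sum(row[1::2]) for row in item_scores]
    return spearman_brown(pearson_r(odd_totals, even_totals))


# Hypothetical responses of three examinees to a four-item test.
responses = [[1, 1, 0, 0], [1, 1, 1, 1], [0, 0, 0, 0]]
reliability = split_half_reliability(responses)
```

The correction matters because the correlation between two half-length tests understates the reliability of the full-length test.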


3. The Battery Used with the Second University Group


The criterion for spelling ability used in the Second University
Group was the same multiple-choice spelling test used with the
First University Group.

The independent variables consisted of the "Q" and "L" scores of
the ACE, 1948, and the Form B of the Phonetic Association Test used
with the High School Group. Table II presents the tests and their
reliabilities.


SECTION III 


STATISTICAL PROCEDURES, ANALYSIS AND 
RESULTS 


Tables III, V, and VII present the matrices of intercorrelations,
the means, and standard deviations of the variables for each of the
samples. All correlations were of the product-moment type. In order
to be sure that the factor of age could be held constant, it was
included in the intercorrelation matrix of the High School Group
and was also calculated for the First University Group.


A. The Substrata Analysis


The intercorrelation matrix of each of the three samples was
submitted to the same type of statistical analysis. The statistical
technique was an extension of the Wherry-Doolittle (13) test
selection method as developed by Holmes (29). This extension makes
it applicable to a substrata analysis of the factors which underlie
success in a given criterion. While the Wherry-Doolittle method
gives the test factors (which underlie predicted success) in the
first order of magnitude, the extension of the method will allow
the extraction of the test factors in the higher orders. The test
factors in the successive orders were extracted until a point of
diminishing returns was reached.




B. The Substrata Analysis of the High School 
Group 


Table III presents the 55 intercorrelations,
means, and standard deviations for the 227 high
school students. Several things in this table
are of interest in the light of the hypothesis being
investigated.

Spelling ability as assessed on a multiple-choice
type of examination which demands recognition
rather than absolute recall shows a .726
correlation with the phonetic association test.
Age within the limits investigated shows no significant
relationship with the ability to spell.

The zero-order correlations of the modified
form of the Kwalwasser-Dykema Music Test
(23) with spelling show that five of
the eight subtests are significant at the one percent
level of confidence. An additional two
are significant at the five percent level. Tonal
Movement has the highest zero-order correlation
with spelling; the r is .429. Rhythm, Pitch,
and Tonal Memory follow with r's which run
from .338 to .300.


1. First Order Factors Underlying Spelling 
Ability 


The matrix, Table III, was submitted to the
Wherry-Doolittle test selection method in order
to solve the multiple correlation problem: Of the
variables sampled, which constitute the primary
test factors underlying the ability to spell at the
high school level? Table IV presents the pertinent
data as they developed in the successive
primary solutions of the shrinkage formula

    \bar{R}^2 = 1 - K^2 \frac{N - 1}{N - m}

in which $\bar{R}$ is the “shrunken multiple correlation
coefficient, the coefficient from which chance
error has been removed.” (13)
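The shrinkage step can be sketched numerically. The snippet below is a minimal illustration, not the writer's original work sheet. It assumes that K² is the unexplained proportion of variance before correction (1 - R²), that N is the number of cases, and that m is the number of tests selected so far. The function name `shrunken_r` is hypothetical, and the K² inputs (.4735 and .4609, as read from the Table IV work sheet for N = 227) should be treated as assumptions.

```python
import math


def shrunken_r(k_squared: float, n_cases: int, n_tests: int) -> float:
    """Wherry shrinkage: R-bar^2 = 1 - K^2 * (N - 1) / (N - m)."""
    k_bar_squared = k_squared * (n_cases - 1) / (n_cases - n_tests)
    return math.sqrt(1.0 - k_bar_squared)


# Assumed inputs: (test name, K^2 before shrinkage, m) for the High School Group.
steps = [("Phonetics", 0.4735, 1), ("Tonal Movement", 0.4609, 2)]
results = {name: round(shrunken_r(k2, 227, m), 3) for name, k2, m in steps}
print(results)
```

With a single test the correction factor is unity, so the shrunken R equals the zero-order r; each added test inflates K² slightly before the root is taken.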

From Table IV it may be noted that, of the
seven musical elements which showed a significant
zero-order relation to spelling ability, only
one, Tonal Movement, is left to make an independent
contribution to the criterion after the
variable Phonetic Association has been taken into
account. From column “g” it is evident that
spelling ability may be predicted with an R of
.733 when Tonal Movement is added to Phonetic
Association. Since R² may be expressed in
terms of the beta coefficients for the zero-order
r's, the proper substitutions in the general formula,


    R^2_{c(2,7)} = \beta_2 r_{c2} + \beta_7 r_{c7}


yield the R² which is the contribution to variance
that this team of tests makes to Spelling

JOURNAL OF EXPERIMENTAL EDUCATION (Vol. XXII)


TABLE II

THE TESTS AND THEIR RELIABILITIES*
(The University Group)

Tests                                                     r       N

First Group
  Criteria
  1. Word Pleasantness Scale for College Students:
     Form 3, Spelling-W:P by J. A. Holmes                .89     104
  2. College Level Spelling Test: Form A, M:C
     by J. A. Holmes (Mimeographed)                      .76     198
  Independent Variables
  3. “Q” Scale of ACE, 1949                              .94     102
  4. “L” Scale of ACE, 1949                              .96     102
  5. Seashore Test of Musical Talents: Full Scale**
     a. Pitch Discrimination                             .86     160
     b. Intensity                                        .81     160
     c. Time Discrimination                              .78     160
     d. Consonance                                       .65*†    58
     e. Tonal Memory                                     .89     160
     f. Rhythm                                           .67     160

Second Group
  Criterion
  1. College Level Spelling Test: Form A, M:C
     by J. A. Holmes (Mimeographed)                      .76     198
  Independent Variables
  2. “Q” Scale of ACE, 1948                              .91     102
  3. “L” Scale of ACE, 1948                              .95     102
  4. Phonetic Association Test: Form B
     by J. A. Holmes (Mimeographed)                      .93     102

* All reliabilities are of the corrected split-half type as derived by the writer on the
group, unless stated otherwise.

** Cf. Seashore (41).

*† Reliability for consonance from McCarthy's study (28). All other reliabilities for
adults on the Seashore Test are from Whitley's study (50).


TABLE III

INTERCORRELATIONS, MEANS, AND STANDARD DEVIATIONS FOR SPELLING ABILITY AND
THE INDEPENDENT VARIABLES TESTED IN THE HIGH SCHOOL GROUP (N = 227)*

[Columns 4 to 7 are not legible in this copy; ... marks an illegible entry.]

Variables                 1      2      3       8      9     10     11
 1. Spelling-M:C                .73   -.09     .15    .34    .33    .15
 2. Phonetics                         -.06     .18    .39    .43    .24
 3. Age                                       -.08    .12    ...    ...
 4. Tonal Memory                               .21    .54    .48    .24
 5. Quality Disc.                              .18    .35    .18    .21
 6. Intensity Disc.                            .23    .35    .22    .02
 7. Tonal Movement                             .21    .51    .50    .30
 8. Time Disc.                                        .29    .32    ...
 9. Rhythm Disc.                                             .55    .26
10. Pitch Disc.                                                     .29
11. Melodic Taste

Mean                    24.8   53.0   16.4     ...    ...    ...    ...
Standard Dev.            6.1    ...    ...     ...    ...    ...    ...

*Correlations must be .18 to be significant at the one percent level.
IBM and hand calculations carried to six significant figures.


ability at the high school level. Further,
the solution of the equation gives
the contribution to variance which each test
makes to the criterion.

Solution of the equation reveals that, of all
the components which go to make for individual
differences in the high school student's ability
to spell, 54 percent of the variance can be attributed
to differences in their ability to handle
phonetic associations and to discriminate and
anticipate good tonal movements. A breakdown
of the contributions which the independent variables
made to Spelling-M:C² ability is:


Phonetic Association   48.4%
Tonal Movement          5.5
Unaccounted for        46.1
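The breakdown above can be approximated from the three zero-order r's quoted in the text. The following is a minimal sketch, assuming the standard two-predictor beta weights on standardized scores; the function name is hypothetical, and because the inputs are rounded to three decimals (.726, .429, and the .454 between Phonetic Association and Tonal Movement) the output agrees with the reported percentages only to within a few tenths of a point.

```python
def two_predictor_contributions(r_c1: float, r_c2: float, r_12: float):
    """Return (beta1 * r_c1, beta2 * r_c2) for two standardized predictors.

    The two products sum to the squared multiple correlation R^2.
    """
    denom = 1.0 - r_12 ** 2
    beta1 = (r_c1 - r_c2 * r_12) / denom
    beta2 = (r_c2 - r_c1 * r_12) / denom
    return beta1 * r_c1, beta2 * r_c2


# Zero-order r's as quoted in the text (three decimals only).
phonetic, tonal = two_predictor_contributions(0.726, 0.429, 0.454)
total = phonetic + tonal
print(f"Phonetic Association: {100 * phonetic:.1f}%")
print(f"Tonal Movement:       {100 * tonal:.1f}%")
print(f"Unaccounted for:      {100 * (1 - total):.1f}%")
```

Each product of a beta weight and its zero-order r is one test's share of the predicted variance, which is how a "team" of tests is apportioned in this analysis.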


That part of the variance which remains unaccounted
for is substantial and must be sought in
other factors. One would expect this since the
relation of the visual perception of word-forms
to spelling (Phelan, 35), the kinaesthetic associations
of writing to spelling (Fernald, 12), and
the relationship of intelligence to spelling were
not investigated in the present experiment. The sig[...]
ability in phonetic
associations had been partialed out.


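2. Second Order Factors Underlying Spelling
Ability


[From the primary analysis] it is evident
that [Phonetic Asso]ciation car[ries a heavy loading;
therefore] it is important,
[for the] practical purpose
[...]ics, to ask: What
[are the skills, abilities, and] aptitudes basic to [...]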

The substrata analysis on this secondary level
reveals that the correlation between Phonetic
Association and Tonal Movement is .454, and
that the multiple correlation is increased to .505
when Pitch is next precipitated out of the matrix.
The multiple rises to .514 when Intensity, the
final test to come out, is precipitated. When
the zero-order correlations are multiplied by
their proper beta weights, it can be shown that
the musical components fundamental
in phonetic associations, with their independent
contributions to variance, are:


Tonal Movement   13.6%
Pitch            10.
Intensity         2.
Total            27.1%


Reflection on the factors precipitated in the
primary and secondary analyses will reveal a
substrata organization of the variables and their
contributions to spelling which can be represented
as shown on the next page.

As for the percents given in the second order
of the schemata: If one takes 13.6 percent of
48.4 percent, it becomes apparent that Tonal
Movement contributes about 6.6 percent to the
spelling variance through phonetic association.
Therefore, Tonal Movement makes a total contribution
to spelling ability of about 12.1 percent.
Pitch contributes 5.1 percent, and Intensity
another 1.4 percent. In other words, these
three elements of music make a statistically detectable
contribution to Spelling-M:C and will
account for 18.6 percent or nearly one-fifth of
the total variance. Statistically then, the
hypothesis is well substantiated at the high school
level.
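The pro-rating arithmetic in this paragraph can be checked in a few lines. This is only an illustrative sketch using the percentages quoted in the text; the variable names are hypothetical.

```python
# Percentages quoted in the text for the High School Group.
phonetic_share = 48.4            # first-order share of spelling variance
tonal_share_of_phonetic = 13.6   # second-order share of phonetic variance
tonal_direct = 5.5               # first-order share of Tonal Movement
pitch_via_phonetics = 5.1
intensity_via_phonetics = 1.4

# A second-order share is pro-rated through the first-order factor it feeds.
tonal_via_phonetics = round(phonetic_share * tonal_share_of_phonetic / 100, 1)
tonal_total = round(tonal_direct + tonal_via_phonetics, 1)
music_total = round(tonal_total + pitch_via_phonetics + intensity_via_phonetics, 1)

print(tonal_via_phonetics, tonal_total, music_total)
```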


3. Third Order Factors Underlying Spelling
Ability


Is Tonal Movement a more complex, less
fundamental factor than most of the other elements
in the revised K-D Test? The music authorities
the writer has sought in this connection
seem to think this is the case. If on musical
and/or psychological grounds a higher complexity
can be assumed, then it is justifiable,
though not necessary to the proof of the major
hypothesis under investigation, to ask: What
are the more basic musical elements involved
in this more complex element, Tonal Movement,
and what part of the variance must we leave as
a residual contribution to the element itself?

With full realization of the above assumption
that Tonal Movement is of a higher complexity
than some of the other musical elements, a
tertiary substrata analysis was made. If Tonal
Movement makes not only a substantial contribution
to spelling ability in the primary analysis
but also makes a substantial contribution

2. Spelling-M:C is used to indicate the multiple-choice spelling test criterion.


Spelling-M:C (100%)
    Phonetic Assoc. (48.4% - 13.1%)
        Tonal Movement (6.6%)
        Pitch (5.1%)
        Intensity (1.4%)
    Tonal Movement (5.5%)
    Unaccounted for (46.1%)


to spelling through phonetic associations in the
secondary analysis, then what are the more
fundamental musical elements which underlie
it? Pursuing the statistical methods outlined
above, it is easy to show that about 40 percent
of the variance which goes to make for individual
differences in Tonal Movement can actually
be attributed to Tonal Memory, Rhythm, Pitch,
and Melodic Taste. The multiple R sequence
starts at .522 and increases to .595 and .618, to finally
stop at .631. When the beta weights are
combined with the zero-order r's of these elements,
the following contributions to the variance
of Tonal Movement are obtained:


Tonal Movement (100%)
    Tonal Memory  (14.2%)
    Pitch         (11.4%)
    Rhythm        (11.0%)
    Melodic Taste ( 3.2%)
    Residual      (60.2%)


The breakdown contributions which these elements
make through Tonal Movement direct to
spelling and through Phonetic Association to
spelling are important and therefore have been
calculated, but to save space they are presented
only in the final Summary of the Study, Section IV.
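As a consistency check on the tertiary analysis, the four contributions should sum to the square of the final multiple R. The sketch below uses only the figures quoted in the text; the dictionary and variable names are hypothetical.

```python
# Tertiary contributions to Tonal Movement variance, in percent (from the text).
contributions = {
    "Tonal Memory": 14.2,
    "Pitch": 11.4,
    "Rhythm": 11.0,
    "Melodic Taste": 3.2,
}

explained = round(sum(contributions.values()), 1)
residual = round(100.0 - explained, 1)
r_squared_pct = round(100 * 0.631 ** 2, 1)  # final multiple R reported as .631

print(explained, residual, r_squared_pct)
```

Agreement between the summed contributions and the squared multiple R is what justifies the statement that about 40 percent of the variance is accounted for.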


4. Summary of the Statistical Analysis on the
High School Group


The ability to make phonetic associations per
se apparently accounts for nearly half of the variance
making for individual differences in Spelling-M:C
ability at the high school level. Tonal Movement,
Pitch, Intensity, Rhythm, and Melodic
Taste appear to be the most important musical
elements (of those tested) functioning directly and
indirectly through phonetics to make for individual
differences in the ability to spell at the high
school level. Approximately 20 percent of the
variance in spelling can be accounted for by
these elements.

Although the analysis did not deal with the 
relationship of Spelling-M:C to the total Holmes 
(23) modified K-D score, it is interesting to 
note that the correlation between these two fac- 
tors was as high as .401. Between phonetic as- 
sociations and the total (23) modified K-D score, 
the r was .475. This is important corroborat- 
ing evidence since the reliabilities of these three 
tests are all .90 or higher.


C. The Substrata Analysis of the First Uni- 
versity Group 


Table V presents the 45 intercorrelations,
means, and standard deviations for the ten variables
tested. The First University Group consisted
of 91 juniors and seniors taking Educational
Psychology. Several interesting observations
may be made direct from the table.

Spelling ability, when assessed on a dictation-write-in
type of spelling test, correlates to the
extent of .823 with spelling ability assessed on
a 5-choice multiple-choice type of spelling test
when the difficulty of the words used is equivalent.
Further, as far as the statistical significance
is concerned, both the “Q” and the “L”
of the Psychological Examination for intelligence
are almost identically correlated with both the
W:P and M:C spelling scores. Correlations of
.221 and .210 express the relationship of the
“Q”-scores to both types of spelling scores,
and the r's of .528 and .534 express their relationship
to the “L” scores. While the “Q”
relationship to spelling does not come up to
the level of significance ordinarily admissible,
the “L” relationship to spelling is highly significant,
much better than the one percent level
of confidence.


TABLE V

INTERCORRELATIONS, MEANS, AND STANDARD DEVIATIONS FOR SPELLING ABILITY AND
THE INDEPENDENT VARIABLES TESTED IN THE FIRST UNIVERSITY GROUP (N = 91)*

Variables: 1. Spelling-W:P; 2. Spelling-M:C; 3. “Q” (ACE, 1949); 4. “L” (ACE, 1949);
5. Pitch; 6. Intensity; 7. Time; 8. Consonance; 9. Tonal Memory; 10. Rhythm.

[The body of the table was printed sideways and is not legible in this copy.]

*Correlations must be [...] to be significant at the one percent level.
IBM and hand calculations carried to six significant figures.
TABLE IV

WORK SHEET FOR CALCULATING FINAL STEPS ON SUCCESSIVELY CORRECTED MULTIPLE R'S
FOR SPELLING-M:C FIRST-ORDER FACTORS FOR THE HIGH SCHOOL GROUP

  a      b        c          d           e        f        g
  m    V²/Z      K²     (N-1)/(N-m)     K̄²       R̄²       R̄     Test
  0             1.0000                                           (N = 227)
  1    .5265    .4735     1.0000       .4735    .5265    .726    Phonetics
  2    .0126    .4609     1.0044       .4629    .5371    .733    Tonal Mov.


Reference to Table V in regard to the relationship
between spelling and the elements of
music as assessed by the Seashore Measures of
Musical Talents (41) reveals that Tonal Memory
and Pitch give correlations significant at the one
percent level. These relationships do not hold
when spelling is assessed on the multiple-choice
type of test at the university level. However, it
is significant to note that while both Intensity
and Consonance or harmony have r's with Spelling-W:P
which stand at the 5 percent level, only
Intensity holds when the M:C-type of spelling
test is used as the criterion.

Part of the explanation for the fact that lower
correlations were found between spelling and
the elements of music in this group than in the
high school group may be sought in the higher
selectivity and reduced variability of the university
group. However, two other explanations
also tenable are (a) that the university M:C-spelling
test had a lower reliability than the high
school M:C-test even though it was longer by fifteen
items, and (b) that the distribution was slightly
skewed. The fact is, the M:C-spelling test at
the university level did not have sufficient top
to array the better spellers. Since this was not
the case with the W:P-spelling test, and since
the words missed on the W:P-test were the very
ones used in the construction of the M:C-test,
it follows that the M:C-spelling test was easier.
This could be because there was carry-over
from the W:P to M:C testings, although several
months intervened, because recognition within
structured limits is easier than recall, and/or
because more of the substrata elements of auditory
imagery must be called upon to cue word-spelling
discriminations in the W:P-spelling
test than is necessary to discriminate word-spellings
where visual-multiple-choice images
are the dominant cues offered. The statistics
about to be offered tend to support the last explanations,
although it is fully recognized that
the others are involved.

Although the data were not included in Table V,
it is pertinent to the reader's insight before
going on to the analysis to know that the correlation
between Spelling-W:P and Total Score on
the ACE intelligence test was .452; between
Spelling-M:C and the Total ACE the r was also
.452. Both of the correlations are identical and
both are lower than the relationships which the
two spelling tests show to the “L”-linguistic or
verbal part of the ACE intelligence test.

Further, while age relationships are not given
in the matrix in Table V, the calculations
were made. Spelling-W:P correlated with age
to the extent of -.038; with Spelling-M:C the r
was +.053. Likewise, the plus and minus correlations
between age and the various elements
of music measured did not differ significantly
from zero. All these age r's were of the order
of .04.


1. First Order Factors Underlying Spelling- 
W:P Ability 


Following the procedure described in the section
dealing with the statistics for handling the
High School Group, the matrix, Table V, was
submitted to a substrata analysis.

The first question is: Which of the variables 
sampled constitute the primary test factors un- 
derlying the ability to spell when the criterion 
is the number of words correctly spelled on a 
dictation-write-in type of Spelling-W:P test? 
Table VI presents the data as they develop in 
the successive primary solutions of the Shrink- 
age formula (13). 

From Table VI it is evident that the Spelling- 
M:C factor precipitated from the matrix first. 
One would expect this as experience makes it 
very clear that the visual-word-form-image is 
the dominant cue-discriminating factor used by 
most students in spelling words. The Spelling- 
M:C test was loaded with visual cues. What is 
important, however, as far as the hypothesis 
being investigated is concerned (i.e., that func- 
tionally some of the elements of music also play 
a significant role in spelling ability), is the fact
that Tonal Memory and Pitch raise the multiple
correlation from .823 to .847. Apparently,
these elements of music make a contribution to
the dictation-write-in type of spelling ability
which is over and above the aggregate of abilities
which function to make for individual differences
in Spelling-M:C at the university level
(when Spelling-M:C is assessed on a test where
success is dependent predominantly on discrimination
by visual-word-form cues). When the
zero-order correlations are multiplied by the 
proper beta weights it follows that the first- 
order test factors which underlie the criterion, 
dictation-write-in type of Spelling-W:P test, 
with their independent contributions to variance 
are: 


Spelling-M:C      64.3%
Tonal Memory       5.0
Pitch              3.1
Unaccounted for   27.6


The unaccounted for variance of 27.6 percent
is appreciable and perhaps can be explained by
that part of the test scores which represents the
unreliability of each test. It is also tenable to
suppose that had other factors been tested more
of the variance would have been accounted for.
A case in point where evidence seems ample
would be such a variable as Tonal Movement.
Recall that of all the musical elements used on
the High School Group, Tonal Movement made
the greatest contribution to spelling. It is regrettable
for our purpose that this element was
not included in the Seashore battery.


TABLE VI

WORK SHEET FOR CALCULATING FINAL STEPS ON SUCCESSIVELY CORRECTED MULTIPLE R'S
FOR SPELLING-W:P FIRST ORDER FACTORS FOR THE FIRST UNIVERSITY GROUP

   K²     (N-1)/(N-m)     K̄²       R̄²       R̄     Test
 .3228      1.0000      .3228    .6772    .823    Spelling-M:C
 .2849      1.0112      .2881    .7119    .844    Tonal Memory
 .2762      1.0227      .2825    .7175    .847    Pitch


TABLE VII

Variables                      2      3      4
1. Spelling-M:C               .19    .51    .64
2. “Q” (ACE, 1948)                   .42    ...
3. “L” (ACE, 1948)                          ...
4. Phonetic Association

[Means and standard deviations are not legible in this copy.]

*Correlations must be .25 to be significant at the one percent level.
IBM and hand calculations carried to six significant figures.


2. Second-Order Factors Underlying 
Spelling-W:P Ability 


From the primary analysis just reported it
is evident that the variable Spelling-M:C carries
a heavy loading and therefore it is important to
ask: What are the skills, abilities, and aptitudes
basic to the ability to spell when spelling is assessed
on a visual discrimination test? More
specific to the hypothesis is the question: What
part of the Spelling-M:C score can be accounted
for by the “Q” and “L” factors of intelligence
and is their variance specific to some of the elements
of music?

To answer this question, a substrata analysis
was made by deleting Spelling-W:P and its
intercorrelations from the matrix in Table V and
allowing Spelling-M:C to become the new sub-criterion
for a secondary multiple correlation
analysis on the variables which were left.

The substrata analysis on the secondary level
reveals that the correlation between Spelling-M:C
and the “L”-score of the ACE intelligence test
is .535. The multiple correlation between Spelling-M:C
and the “L”-score plus the “Q”-score
rises to .546. However, the increase is accomplished
in the calculations by a reversal in the
usual sign of the V entry. This is brought about
by the high intercorrelation between the “Q” and
the “L” scores obtained in the present sample
on the 1949 ACE Test. Multiplying the zero-order
correlations by the beta weights gives us
the contributions to variance for the components
which underlie Spelling-M:C. These are:


“L”-type intelligence    34.2%
“Q”-type intelligence    -3.7%

The contribution which “L”-type
intelligence makes to the Spelling-M:C test
is not hard to justify. The negative contribution
which the “Q”-score makes to the test, however,
is more difficult to handle. Apparently, it means
either that the true contribution of the “Q”-score
is zero, and the negative contribution is
a chance deviation from a true zero relationship
between this type of spelling ability and “Q”-type
intelligence, or it actually means that
when the linguistic or verbal abilities are held
constant, the relationship between “Q” and spelling
ability is slightly negative! Since the Wherry
shrinkage formula professes to remove chance
error from the multiple correlation coefficient,
it appears that the latter alternative must be
tentatively held, held for verification.
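Why a positive zero-order r can yield a negative contribution may be seen from the standard two-predictor beta weight. The derivation below is a sketch under the assumption that only the “L” and “Q” scores enter the equation; the subscripts c, L, and Q (criterion Spelling-M:C, “L” score, “Q” score) are notation introduced here for illustration, and the r's of .210 and .534 are the zero-order values quoted earlier for the M:C test.

```latex
\beta_Q = \frac{r_{cQ} - r_{cL}\, r_{QL}}{1 - r_{QL}^{2}},
\qquad
\text{contribution of } Q = \beta_Q\, r_{cQ}.

% For r_{cQ} > 0 the contribution is negative exactly when the numerator is:
\beta_Q\, r_{cQ} < 0
\iff r_{cQ} < r_{cL}\, r_{QL}
\iff r_{QL} > \frac{r_{cQ}}{r_{cL}} = \frac{.210}{.534} \approx .393 .
```

On these figures any Q-L intercorrelation above roughly .39 gives the “Q” score a negative beta weight, which is consistent with the high intercorrelation noted for the 1949 ACE.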

A flow-sheet diagram of how the first- and
second-order factors contribute to Spelling-W:P
in the First University Group would appear, then,
as shown at the top of the next page.


3. Third-Order Factors Underlying 
Spelling-W:P Ability 


The linguistic or L-type of intelligence makes
such heavy contributions to both types of spelling
scores (the W:P and the M:C techniques must
both assay different aspects of the same spelling
ability) that the question arises: Does any part
of the verbal ability depend upon a potential to
discriminate in the area of auditory images?
Convention should immediately make one look
with some mistrust at the question just stated,
for practice and theory in the field of psychological
testing have tended to teach that intelligence
is basic, be it in terms of a “general-G” or in
terms of “primary-abilities” such as induction,
spatial, perceptual, numerical, verbal, etc. It
is true that Wechsler (49) has called on psychologists
to recognize that part which “non-intelligence”
or “personality factors” play, not only
in life-behavior but also in individual intelligence
test scores. Again, Davis, et al. (7, 11) have
stressed the importance of cultural differences
in response to various intelligence test items.
What the present writer is asking for is a recognition
of the fact that other basic abilities, which
psychological convention does not include as constituting
part of the realm of intelligence or personality,
nor yet cultural background, may
play a role not only in school life behavior but also
in the conventional intelligence test scores.

Psychological convention will also bring another
train of thought to the reader; that is, any
correlation found between intelligence and any
element of an auditory image expresses the part
that intelligence plays in the function of the auditory
element and not the reverse. To this objection,
it is well to reflect that statistical authorities
agree that correlation, while indicating
relationship, does not give cause and effect
data. In other words, from a coefficient of correlation
one cannot determine what is cause
and what is effect; such determination must be
left to psychological insight, logical analysis,
experimental design, or finally, common sense.
Psycho-educational insight into the present question
would indicate to the writer, at least, that
while for statistical convenience a logical sequence
of substrata elements must be postulated
for any school subject, in reality a mutual³-and-reciprocal
causation must actually be the case.
Mutual-and-reciprocal causation may be thought
of as the relative⁴-impact-support which factor
A gives to B and B to A when B is the more complicated,
complex, or more difficult ability.
The use of the more elemental A in learning
to use the more difficult B results not only in
development of B but also in a sharpening, a
differentiating refinement, a knowledge of, and

3. A works on B, B on A, and AB may work on C.
4. Not necessarily equal in both directions.


Spelling-W:P (100%)
    Spelling-M:C (64.3% - 19.6%)
        “L”-intelligence (22.0%)
        “Q”-intelligence (-2.4%)
    Tonal Memory (5.0%)
    Pitch (3.1%)
    Unaccounted for (27.6%)


facility with the element[...]
[...]tion-of-cause-and-effect [...]
the “L”-score, they [...]
variance are:


Intensity   9.7%
Pitch       4.5%


4. Summary of the Statistical Analysis on the
First University Group


Spelling ability was assessed on both a dictation-write-in-type
test and a multiple-choice type test.
The dictation-write-in-type
test was used as the criterion “spelling ability”
as it was believed to more closely simulate
the functional spelling ability which the
student uses when he is writing, thinks of a word,
and must recall how to spell it rather than select
it by recognition. It is true that he may recall
several ways to spell and misspell the word
and then make a selection from recognition, but
still recall plays the major role.

When dictation-write-in spelling ability was
used as the criterion, a multiple correlation
analysis revealed that the multiple-choice type
of spelling makes a high contribution to the variance
of the criterion. The contribution, 64.3
percent, is not so high, however, as to exclude
the contributions which other factors might make
to the criterion. Two factors, Tonal Memory
and Pitch, were found to make additional independent
contributions.

When the multiple-choice spelling test
was in its turn used as a sub-criterion, it was
found that the “L” or linguistic score on the
ACE would account for 34.2 percent of its variance.
The “Q” or quantitative score, however,
made a negative contribution to the variance
when the “L” score was held constant. Further,
the Spelling-M:C variance accounted for [could not]
be increased by the addition of the particular elements
of auditory images under investigation
in this group.

If one assumes that the functioning of intelligence
as usually defined is facilitated by basic
abilities in allied areas and also that the functioning
of these allied basic abilities may be facilitated
by intelligence as usually defined, then
one may postulate a mutual-reciprocal-[interac]tion-of-cause-and-effect
reaction between the two factors.
Once this hypothesis is tenable, part of
the relationship which intelligence as usually
defined has to spelling may be accounted for in
terms of the allied basic abilities. This hypothesis
was tested: Part of the “L” or linguistic
intelligence variable as it functions to make for
individual differences in spelling can be accounted
for by individual differences in the ability
to discriminate in some of the elements of auditory
images. The substrata analysis indicated
that both intensity and pitch exert such a functional
influence on the verbal or “L” score.

The major hypothesis, that the elements of
the auditory image play a role in accounting for
individual differences in spelling ability, therefore,
is substantiated at the university level as
well as at the high school level.


D. The Substrata Analysis of the Second Uni- 
versity Group 


Table VII presents the six intercorrelations,
means, and standard deviations for the 102 students
who constitute the Second University Group.
In the light of results obtained on the High School
Group and on the First University Group, two
questions naturally arise: (1) How would phonetics
correlate with spelling at the university
level? (2) What part of success in the Phonetic
Association Test can be attributed to the Linguistic
Intelligence Test Score?

Inspection of Table VII reveals that the zero-order
r between Spelling-M:C ability and “L”-type
intelligence is .509; between Spelling-M:C
and Phonetic Association the r is .637. The interrelationship
between “Q” and “L” intelligence
is expressed by an r of .420, considerably
lower than the Q-L intercorrelation found
on the 1949 version of the ACE used on the First
University Group.


1. First Order Factors Underlying Spelling-M:C
Ability


Solution of the multiple correlation problem
gives Phonetic Association and “L” intelligence
as the first and second tests selected as making
independent contributions to the criterion Spelling-M:C.
The multiple R rises from .637 to
.675 when “L” is added to Phonetic Association.
Solution of the beta equations gives the following
contributions which these variables make to the
variance of Spelling-M:C:


Phonetic Associations    32.5%
“L”-type Intelligence    13.5%
Unaccounted for          54.0%


2. Second Order Factors Underlying Spelling-M:C
Ability


A substrata analysis of Phonetic Associations
in the second order of magnitude reveals further
that the “L”-type Intelligence will account for
[...] percent of the phonetic association score.
Thus, a flow-sheet diagramming of the analysis
of Spelling Ability for the Second University
Group would appear like the diagram at the top
of the next page.


3. Summary of Statistical Analysis on Second
University Group


The analysis of the Second University Group
may be briefly summarized by answering the
two questions posed in the opening paragraph of
Section D. First, phonetic ability does play an
important part in the spelling ability of university
students, though not quite so much as it
plays in the spelling ability of high school students.
Second, the “L”-type intelligence makes
only a 13.5 percent contribution directly and a
7.4 percent contribution indirectly through phonetics
to Spelling-M:C ability. Therefore, an
appreciable amount of the variance not accounted
for in the primary analysis may be accounted
for by elements of auditory images. Likewise,
analysis of the phonetic variable in relation to
the other group indicates that after the “L”-variable
has been taken out, the elements of
auditory images still may function to make sizable
contributions to spelling ability. Finally,
the “Q”-score makes no contribution to Spelling-M:C
ability when the “L” is held constant.


SECTION IV 
SUMMARY AND CONCLUSIONS 
A. Summary 


1. Hypothesis: Individual differences in spelling
ability may be accounted for not only by individual
differences in intelligence and visual-word-form-discrimination,
but also by some
of the elements fundamental to auditory images.
The unknown part which kinaesthetic images 
play in spelling ability is recognized but is not 
under investigation at this time. 

2. Purpose: The hypothesis substantiated, 
the experiment aims at isolating those elements 
of auditory images which make detectable con- 
tributions to the variance of spelling ability at 
the high school and university levels. 

3. Populations: A sample of 227 students 
was drawn at random from English and Social 
Science classes in a large urban high school to 
form the High School Group. Two samples of 
91 and 102 students were drawn at random from 
the writer's Educational Psychology classes to 
form the First and Second University Groups. 

4. Variables: In various combinations ac- 
cording to the sample being tested, the follow- 
ing variables were investigated: dictation- 
write-in type of Spelling-W:P ability, 5-choice 
multiple-choice type of Spelling-M:C ability, 
age, phonetic association ability, quantitative 
or Q-type intelligence, linguistic or L-type in- 
telligence, tonal memory, quality, intensity, 


344 JOURNAL OF EXPERIMENTAL EDUCATION (Vol. XXII


[Flow chart: Spelling-M:C (100%); Phonetic Assoc. (32.5% - 7.4%);
‘‘L’’-Intelligence (13.5% directly; 7.4% through phonetic
associations); Unaccounted for (54.0%)]


movement, time, rhythm, pitch, melodic taste,
and consonance or harmony. Most of the
variables were assessed on two different
instruments but with some overlap from sample to
sample in accordance with the experimental
design.

5. Method: A substrata multiple correlation
analysis was made on the intercorrelation
matrices obtained on all three samples. The
substrata analysis allowed the elements which
influence spelling ability to be precipitated from
the matrices in accordance with the depth from
which they function.

6-A. Results: For the High School Group
the test-factors which underlie Spelling-M:C
ability on the primary level are phonetic
associations and tonal movement. When phonetic
associations were used as the sub-criterion on the
secondary level, the test-factors tonal movement,
pitch, and intensity were found to make
independent contributions. When tonal movement
was used as the sub-criterion on the
secondary and tertiary level, the test-factors
found were tonal memory, pitch, rhythm,
and melodic taste.

When the contributions of elements are
pro-rated through the sub-criteria subsumed within
the major criterion, Spelling-M:C, the organization
of the variables with their percent
contributions may be [...]. [Figure 1]
presents this organization.

6-B. Results: For the First University Group
the test-factors found to underlie dictation-write-in
type of Spelling-W:P ability on the primary
level are Spelling-M:C, tonal memory, and
pitch. Two factors (with their variance) were
found to underlie the sub-criterion [...].
The ‘‘L’’-score of the ACE Intelligence Test
contributed 34.2 percent. The ‘‘Q’’-score made a
negative contribution to the Spelling-M:C ability
which amounted to -3.7 percent! None of the
elements of auditory images investigated could
be precipitated out at the secondary level. When
the ‘‘L’’-score was used as the sub-criterion on
the tertiary level, it was found that two of the
elements of auditory images could be considered
to be fundamental components of the ‘‘L’’-score.
These elements are intensity and pitch.

A theoretical discussion is given to [...] the
writer's position and hypothesis when he [...]
that psychologists and teachers recognize the
fact that other basic abilities which convention
does not include as constituting a part of the
realm of intelligence, personality, or of cultural
differences may also play a role, not only in
behavior in life situations but also in the
conventional intelligence test score. This hypothesis
developed, the data and arguments are
submitted for the reader's consideration.

Figure 2 presents for the First University
Group the organization [...] substrata [...]
which [...] Spelling-W:P have been pro-rated
through the sub-criteria subsumed within the
major criterion.

[...] of the variance [...] may be attributed to
[...] of auditory images at the university level.
Tonal memory [...] While [...] of spelling may
[...] at the high school level by the elements
[...], only a little more than a tenth may be
accounted for [...] college level. This is true in
sp[...] criterion [...] in the two sa[...] of
auditory images [...] at the college level than at
the [...] students become older, [...]-form images
so that the auditory [...] depended upon [...]
However, [...] The possibility is [...] of
visual-word [...] start with [...] abilities and
[...] change or [...] become more adept in the
use of ele[...]


Criterion: Spelling-M:C (100%)

Substrata layer from which          Contribution     Pro-rated
elements function                   at that layer    contribution to
                                                     Spelling-M:C

Primary    Phonetic Assoc.            (48.4%)          35.3
Secondary    Tonal Movement           (13.6%)           4.0
Tertiary       Tonal Memory           (14.2%)          [illegible]
Tertiary       Pitch                  (11.4%)            .8
Tertiary       Rhythm                 (11.0%)          [illegible]
Tertiary       Melodic Taste          ( 3.2%)          [illegible]
Secondary    Pitch                    (10.6%)          [illegible]
Secondary    Intensity                ( 2.9%)           1.4

Primary    Tonal Movement             ( 5.5%)           3.9
Secondary    Tonal Memory             (14.2%)            .9
Secondary    Pitch                    (11.4%)            .6
Secondary    Rhythm                   (11.0%)            .6
Secondary    Melodic Taste            ( 3.2%)          [illegible]

Primary    Unaccounted for            (46.1%)          46.1
           (Visual, Kinaesthetic, Intelligence, Personality and
           factors of teaching methodology)
                                                      ------
                                                      100.0%

Figure 1

Flow Chart, Diagrammatically Showing the Functional Relationships of the Elements
of Auditory Images Which Underlie Spelling Ability at the High School Level
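
The pro-rating in Figure 1 appears to follow a simple rule, sketched below in Python as an illustration rather than as the author's stated procedure: an element's share of the criterion variance is taken as its parent's share multiplied by the element's percent contribution to that parent, and the parent keeps whatever is not passed down. The function name `prorate` and the nested-dictionary layout are assumptions made for this sketch; the percentages are those printed for the phonetic-association branch of Figure 1.

```python
def prorate(name, share, tree):
    """Pro-rate `share` (percent of criterion variance) through a sub-tree.

    `tree` maps each sub-element to (percent of the parent's variance,
    that sub-element's own sub-tree). Returns {label: pro-rated percent};
    the parent keeps the residual that is not passed down.
    NOTE: this rule is inferred from the figure, not quoted from the text.
    """
    out = {}
    residual = share
    for child, (pct, subtree) in tree.items():
        child_share = share * pct / 100.0
        residual -= child_share
        out.update(prorate(f"{name} > {child}", child_share, subtree))
    out[name] = residual
    return out


# Phonetic-association branch of Figure 1 (percentages as printed).
phonetic_branch = {
    "Tonal Movement": (13.6, {
        "Tonal Memory": (14.2, {}),
        "Pitch": (11.4, {}),
        "Rhythm": (11.0, {}),
        "Melodic Taste": (3.2, {}),
    }),
    "Pitch": (10.6, {}),
    "Intensity": (2.9, {}),
}

result = prorate("Phonetic Assoc.", 48.4, phonetic_branch)
for label, value in result.items():
    print(f"{label}: {value:.1f}")
```

Rounded to one decimal, the sketch gives 35.3 for phonetic associations, 4.0 for tonal movement, and 1.4 for intensity, which agree with the legible pro-rated entries in Figure 1; the pieces sum back to the 48.4 percent assigned to phonetic associations on the primary level.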




Criterion: Spelling-W:P (100%)

Substrata layer from which          Contribution     Pro-rated
elements function                   at that layer    contribution to
                                                     Spelling-W:P

Primary    Spelling-M:C               (64.3%)          44.7
Secondary    L-Score                  (34.2%)          18.9
Tertiary       Intensity              ( 9.7%)           2.1
Tertiary       Pitch                  ( 4.5%)           1.0
Secondary    [label illegible]                          2.4 [sign illegible]

Primary    Tonal Memory               ( 5.0%)           5.0
Primary    Pitch                      ( 3.1%)           3.1
Primary    Unaccounted for            (27.6%)          27.6

Figure 2

Flow Chart, Diagrammatically Showing the Functional Relationships of
the Elements of Auditory Images Which Underlie Spelling Ability
at the University Level




June, 1954) 


one who is apt to fit into our visually-dominant 
educational program! 

6-C. Results: For the Second University
Group the factors found to underlie the recognition
type of Spelling-M:C ability are phonetic
associations and the ‘‘L’’-type intelligence. On
the secondary level, ‘‘L’’-type intelligence contributed
22.7 percent of the variance to phonetic
associations. Neither in the primary nor secondary
analysis did the ‘‘Q’’-type intelligence
make a significant independent contribution.
Therefore, the negative contribution found in the
First University Group can, perhaps, be discounted
as an expression arising from chance
fluctuations from sample to sample; however,
it must be remembered that the ‘‘L’’-factor was
tested on a different edition of the ACE Test in
the Second University Group.


B. Conclusions and Suggestions 


Indicated limitations preclude sweeping generalizations
which might go beyond the populations
sampled; however, a few conclusions seem
warranted:


1. Spelling ability at the high school and university
level depends to a large extent upon ability
to handle phonetic associations.

2. Both spelling ability and phonetic ability
are dependent to some extent upon the ‘‘L’’ or
linguistic type of intelligence.

3. The ‘‘Q’’ or quantitative type of intelligence
test score makes no contribution to ability to
spell when the ‘‘L’’-type of intelligence is held
constant, realizing, of course, that some ‘‘Q’’
is probably incorporated in the ‘‘L’’-score.

4. Some of the elements of auditory images
make an independent contribution to spelling ability
at both educational levels even after phonetic
ability and ‘‘L’’-type intelligence are held
constant.

5. The elements of auditory images which
appear to make contributions to spelling ability
either directly or indirectly are tonal movement,
tonal memory, pitch, intensity, rhythm, and
melodic taste. Together they account for nearly
[illegible] percent of the variance in spelling at the high
school level and about 10 percent at the college
level.


6. For the educator, the inference would
seem to be that training in at least the listed elements
of auditory images might be a fruitful avenue
by which to attack the problem of teaching
spelling readiness, especially to remedial cases
who have failed to learn by the usual visual-word-form
method of teaching.

The major inference growing out of this
study is that the functional efficiency of ‘‘L’’-score
abilities is in part dependent upon elements
of auditory images.


HOLMES 347 


C. Suggestions for Future Research 


Now that the fact is established that there exists
a definite relationship between certain elements
of auditory images and spelling ability,
it remains for future research to determine
whether or not teaching for improvement in these
elements will improve a student’s ability in them
and consequently improve his spelling. Further,
new tests should be devised to probe for other 
auditory elements which might bear a relation 
to spelling ability. 

Several hypotheses have grown out of this 
study. Those which might warrant further in- 
vestigations follow: 


1. Auditory images are important to reading 
ability, especially speed of reading, both silent 
and oral. 

2. Auditory images play a greater role in 
determining spelling ability at the elementary 
school level than they do at either the high 
school or university level. 

3. When the ‘‘L’’-type or verbal intelligence 
is held constant, the relationship between spel- 
ling and the ‘‘Q’’-type intelligence will be either 
zero or negative. 

4. Success in conventional intelligence tests
as well as specific school subjects depends not
only upon ‘‘G’’, primary mental abilities, personality
factors, and cultural background, but
also upon such other basic abilities as those
which might be considered fundamental elements
in visual (46), kinaesthetic (31), and auditory
images.


BIBLIOGRAPHY 


1. Alper, T. G. ‘‘Diagnostic Spelling Scale
for College Levels: Its Construction and
Use,’’ Journal of Educational Psychology,
XXXIII (1942), pp. 273-290.

2. Archer, C. ‘‘Transfer of Training in Spelling,’’
Elementary English Review, V
(1928), pp. 55-61.

3. Ayer, F. C. A High School Spelling Vocabulary
(Austin, Texas: The Steck Co.,
1945).

4. Buckingham, B. R., and Dolch, E. W. A
Combined Word List (Boston: Ginn and
Co., 1936).

5. Carroll, H. A. ‘‘Generalization of Bright
and Dull Children; A Comparative Study
with Special Reference to Spelling,’’
Journal of Educational Psychology, XXI
(1930), pp. 489-499.

6. Cook, W. A. and O'Shea, M. V. The Child
and His Spelling (Indianapolis: Bobbs-Merrill
Co., 1914).


7. Davis, A. Social Class Influences Upon
Learning, The Inglis Lecture (Cambridge,
Mass.: Harvard University Press, 1948).

8. Davis, G. ‘‘Remedial Work in Spelling,’’
Elementary School Journal, XXVII (1927),
pp. [...]5-625.

9. [author's name illegible], D. F. ‘‘Misleadings vs. Actualities
in Spelling,’’ American School Board
Journal, CXX (1950), pp. 33-34.

10. Dolch, E. W. Better Spelling (Champaign,
Ill.: Garrard Press, 1942).

11. Eells, K., et al. Intelligence and Cultural
Differences (Chicago: University of Chicago
Press, 1951).

12. Fernald, G. M. Remedial Techniques in
Basic School Subjects (New York: McGraw-Hill
Book Co., Inc., 1943).

13. Garrett, H. E. Statistics in Psychology and
Education (New York: Longmans, Green
and Co., 1947).

14. Gates, A. I., and Chase, E. H. ‘‘Methods
and Theories of Learning to Spell Tested
by Studies of Deaf Children,’’ Journal of
Educational Psychology, XVII (1926), pp.
289-300.

15. Gates, A. I., and Russell, D. H. Diagnostic
and Remedial Spelling Manual (New
York: Bureau of Publications, Teachers
College, Columbia University, 1940).
16. Gilbert, L. C., and Gilbert, D. W. ‘‘The [...]

17. Groves, J. W. [...]

[...] ‘‘The Relative Influence of
Visual and Auditory Factor[s] [...]

19. Hartman, G. W. Educational Psychology
(New York: American Book Co., 1941).

20. Hildreth, G. ‘‘Spelling in the Mode[rn ...]
Program,’’ National Elementary Principal,
XX (1941), pp. 476-483.

21. Hincks, E. M. Disability in Reading and Its
Relation to Personality (Cambridge:
Harvard [...]).

22. [...]nia, 1948.

23. Holmes, J. A. ‘‘Increased Reliabilities,
New Keys, and Norms for a Modified
Kwalwasser-Dykema Test of Musical Aptitudes,’’
Journal of Genetic Psychology [...]
(1952).

24. Horn, E. ‘‘The Child's Early Experience
with the Letter A,’’ Journal of Educational
Psychology, XX (1929), pp. 161-168.


25. Hudson, J. S. and Toler, L. ‘‘[Instruction]
in Auditory and Visual Discrimi[nation as a]
Means of Improving Spelling,’’ Elementary
School Journal, XLIX (1949), pp.
[...]469.

26. Humphrey, K. D. ‘‘Similarities in the Teaching
of Shorthand and Spelling,’’ Business
Education World, XXV (1945), pp.
[...]297; 334-337.

27. Kiefer, F. A., and Sangren, P. V. ‘‘An Experimental
Investigation of the [Causes of]
Poor Spelling Among University Students
with Suggestions for Improvement,’’ Journal
of Educational Psychology, XVI (1925),
pp. 38-47.

28. McCarthy, D. ‘‘A Study of the Seashore
Measures of Musical Talent,’’ Journal of
Applied Psychology, XIV (1930), pp. 437-455.

29. McGovney, M. ‘‘Spelling Deficiency in Children
of Superior Quality,’’ Elementary
English Review, VII (1930), pp. 146-148.

30. Mendenhall, J. E. An Analysis of Spelling
Errors (New York: Teachers College,
Columbia University, 1930).

31. Michael, W. B. ‘‘Factor Analyses of Tests
and Criteria,’’ Psychological Monographs,
LXII (1949).

32. Miller, H., and others. Creative Teaching
in the Field of Spelling (Des Moines:
Wallace Publishing Co., 1931).

33. Nolde, E. ‘‘Outline for a Possible Consideration
of the Psychological Factors Involved
in Spelling,’’ Journal of Educational
Psychology, XXXIX (1948), pp. [...].

34. Palmer, M. E. ‘‘Abilities Possessed by the
Good Speller,’’ Elementary English
Review, VII (1930), pp. 149-150; 160.

35. Phelan, M. [Visual Percep]tion in Relation
to Variance in Reading and Spelling
(Washington, D. C.: Catholic Education
Press, 1940).

36. Rapaport, D. Diagnostic Psychological
Testing (Chicago: Yearbook Publishers,
Inc., [...]).

37. Rosanoff, A. Free Association Test [...]
Vol. 2 (New York: John Wiley and Sons,
Inc., [...]).

38. Russell, D. H. Characteristics of Good and
Poor Spellers, Contributions to
Education, No. 727 (New York: Teachers
College, Columbia University, 1937).

39. Russell, D. H. ‘‘Spelling Ability in Relation
to Reading and Vocabulary Achievements,’’
Elementary English Review, XXII (1946),
[...].

40. Schonell, F. J. ‘‘Ability and Disability in
Spelling Amongst Educated Adults,’’
British Journal of Educational Psychology,
VI (1936), pp. 123-146.


41. Seashore, C. E. Seashore Measures of
Musical Talents, Five RCA Victor records.
Distributed by C. H. Stoelting Co.,
Chicago.

42. Spache, G. ‘‘Characteristic Errors of Good
and Poor Spellers,’’ Journal of Educational
Research, XXXIV (1940), pp. 182-189.

43. Spache, G. ‘‘Spelling Disability Correlates.
II. Factors that May Be Related to Spelling
Disability,’’ Journal of Educational
Research, XXXV (1941), pp. 119-137.

44. Spache, G. ‘‘Spelling Disability Correlates:
Factors Probably Causal in Spelling Disability,’’
Journal of Educational Research,
XXXIV (1941), pp. 561-586.

45. Templin, M. C. ‘‘Comparison of the Spelling
Achievement of Normal and Defective
Hearing Subjects,’’ Journal of Educational
Psychology, XXXIX (1948), pp. 337-346.

46. Thurstone, L. L. A Factorial Study of Perception
(Chicago: University of Chicago
Press, 1944).

47. Tidyman, W. F. The Teaching of Spelling
(Yonkers-on-Hudson, N. Y.: World Book
Co., 1919).

48. Watson, A. E. Experimental Studies in the
Psychology and Pedagogy of Spelling (New
York: Teachers College, Columbia University,
1935).

49. Wechsler, D. ‘‘Cognitive, Conative, and
Non-Intellective Intelligence,’’ American
Psychologist, V (1950), pp. 78-83.

50. Whitley, M. T. ‘‘A Comparison of the Seashore
and the Kwalwasser-Dykema Music
Tests,’’ Teachers College Record, XXXIII
(1932), pp. 731-751.




ANNOTATED BIBLIOGRAPHY OF PUBLICATIONS
RELATED TO TEACHER EVALUATION*


WILLIAM A. WATTERS 
Chicago Public Schools 


THIS BIBLIOGRAPHY, relating to teacher
evaluation, came about as one phase of work
of the Committee on Rating in the Chicago Public
Schools. Charged with the responsibility for
revising and improving the procedures of teacher
and principal evaluation in the school system,
the Committee has, in two years of activity, surveyed
and studied the literature of research and
the philosophical approaches in the aims and objectives
of school personnel evaluation. Such investigation
was fundamental to the development
of the new instruments and procedures now in
experimental use in connection with the evaluation
of a large sample of teachers and principals.

It is felt that basic to a professionally acceptable
evaluation plan is wide participation of school
staff in the development and improvement of methods
and instruments. The Committee itself is
composed of a cross-section of school personnel
from teachers to assistant superintendents.
Questionnaire techniques have been and are being
continuously used to determine the needs and
attitudes of the large number of individuals, who,
for obvious practical reasons, cannot participate
as members of the Committee. It is as an aspect
of the participatory process that this list
of annotations will be of value if the easier location
of pertinent source material is made possible
for persons busy with other important educational
activities.

In terms of function, it was desirable that 
the bibliography be extensive in the sense that 
it cover all material directly related to teacher 
evaluation; it was not necessary that it be so ex- 
tensive as to cover all publications having a 
bearing on related topics. For example, many 
articles on the merit-type salary issue contain 
information pertinent to teacher evaluation, and 
these are included, even though the merit-type 


salary philosophy has been rejected by the Com- 
mittee. On the other hand, articles focused on 
salary schedule problems not relating to teach- 
er evaluation are not included, nor are publica- 
tions in the field of teacher competency unrelat- 
ed to evaluation. 

As to the span of years to be covered, the 
publication in 1950 of a comprehensive bibliog- 
raphy of teacher competence by Domas and Tied- 
eman** made it unnecessary to include articles 
published prior to May of 1949, the last month 
covered in that work, which contains complete 
listings in the area of teacher evaluation. Ac- 
tually, all articles appearing during 1949 were 
checked for inclusion in that bibliography, and 
only those not covered by Domas and Tiedeman 
are included in the list to follow. An effort was 
made to investigate and annotate all relevant 
articles from that time through March, 1953. 


The annotations do not attempt value judgments;
the practice was to present as objectively
as possible either the substantive content in
brief, or a few remarks as to the nature of that
content. There are some instances where limitation
of library facilities (and time and distance)
made impractical the annotations. In these cases
the listing is given and the item is marked with
an asterisk.

It is hoped that this bibliography will be 

an efficient aid and a positive contribution to the 
cooperative effort to improve teacher evaluation. 
The recommended use will of course be only in 
conjunction with the Domas and Tiedeman work, 
for there is no implication that all the pertinent 
work has appeared since 1949. The Index and 
Classification of Annotations following the Bibli- 
ography should provide an efficient means of lo- 
cating the desired material quickly. 


* A Project of the Committee on Rating, Chicago Public Schools, 1953.

** Simeon J. Domas, and David V. Tiedeman, ‘‘Teacher Competence: An
Annotated Bibliography,’’ Journal of Experimental Education, XIX
(December 1950), pp. 99-218. Reference 29 in this bibliography.


BIBLIOGRAPHY

1. Barr, Arvil S. ‘‘Measurement of Teaching
Efficiency,’’ [...] Points in Educa[...]
department of the National Education Association,
1949. Pp. 251-54.
This is a review of the subject, making
these points: (1) Evaluation of teachers
was, is, and al[...] rating is only one method
[...] The trend is toward applying more than one
[...], more than one type of data [...] device,
and more than one evalu[...] more than one person,
method and [...] of many problems in [...]
development of adequate means of teacher evaluation.
Four approaches in teacher evaluation [...]
evaluation of performance, [...]; (2) evaluation of
[...] that seem to lie [...] performance; (4)
eval[...] of the four approaches [...] Outlines three
[...]: (1) Tests for [...]; (2) rating scales, check
[...] criteria of teacher effect[...]; (3) inventories.

2. [...] ‘‘Merit Pay for Teachers?’’ [...]
(September 1949), [...]
[...] through training and exper[...] demonstrated
interests [...] special apti[...] helping others,
[...] The categories of Apprentice and Journey[...]
designation Mast[...] qualification in any one of
these criteri[...] to adequacy in all.
Three important questions are posed:
(1) Shall there be merit pay increases
within these categories?
(2) How finely can these differences be
drawn using the now available techniques?
(3) What ultimate social and educational
values will be derived from such differentiations?
The answer to the first question is a
tentative Yes, but dependent on the answer
to the third question and for this ‘‘we need
more data.’’ The answer to the second
question is that we are able to differentiate
only within broad categories.

3. Barr, Arvil S. ‘‘Teaching Competencies,’’
Encyclopedia of Educational Research, Revised
Edition 1950. New York: Macmillan
Co. [...]
[...] teacher efficiency, the wider [...] inferred
from behavior [...] the teacher is considered:
The teacher as (a) a director of learning,
(b) as a friend and counselor of pupils, (c)
a member of a group of professional workers,
and (d) a citizen participating in various
community activities. Three approaches
to teacher efficiency are suggested: (1)
Definitions based on estimates of traits as[...]
[...]bility, intelligence, etc., (2) definitions
[...] such as discovering and defining
pupil needs, setting goals, stimulating interest,
etc., and (3) definitions derived from
measures of pupil growth. There is
a review [...] pertinent [...].

4. Barr, A. S. ‘‘[...] Merit [...] Work,’’
[American Teacher] [...] (March
1952), p. 24.
Although [...] under [...] in the Education Index
this article is not by Barr,
but is an item in the ‘‘Education News Digest’’
column of the [...]her. [...]
only [...] context, are given.

5. Barr, Arvil S. ‘‘Measurement of Teacher
[...] and Prediction of [...],’’
Review [...] Research, [...].
[...] of recent research in relating
[...] personal qualities [...]
prediction of teaching success, and
miscellaneous studies. A discussion of the
criteria of teaching effectiveness and needed
research. Followed by a careful bibliography
of research in this field.


6. Barr, Arvil S. and others. ‘‘Report of the
Committee on the Criteria of Teacher Effectiveness,’’
Review of Educational Research,
XXII (June 1952), pp. 238-63.
An involved discussion of the complex
conceptual framework in which the Committee
orients the research problem. In
submitting this report, the Committee recommends
that clearance be given for specific
and varied research activities. (There
is also a minority report signed by one man.)


7. Beecher, Dwight E. ‘‘Objections Answered
—The New York Plan of Rewarding Good
Teaching,’’ American School Board Journal,
CXIX (October 1949), pp. 39-7.
Taking up fifteen specific quotations
from an article by Spaulding (American
School Board Journal, July 1949) the writer
criticizes and answers each. Seven positive
values of the 1947 New York salary law
are listed and developed.


8.*Beecher, Dwight E. Evaluation of Teaching.
New York: Syracuse University Press,
1949.


9. Beecher, Dwight E. ‘‘Judging the Effectiveness
of Teaching,’’ National Association of
Secondary School Principals Bulletin, XXX[...]
([...]mber 1950), pp. 270-81.
[...] on teacher evaluation structured
around two aspects: (1) Basic Concepts in
the Evaluation of Teaching, and (2) Applying
the Basic Principles of Evaluation. ‘‘A
thorough evaluation of teaching is an essential
and basic function of supervision. Such
evaluation should be viewed by teachers,
supervisors, and administrators as a constructive,
cooperative guidance procedure
aimed at the improvement of instruction.
Orientation to this concept may best be achieved
through active teacher participation
in planning and executing the evaluation program.
Teacher fear of imposed ratings may
best be dispelled through the mutual confidence
provided by such participation. The evaluation
of teachers should be continuous rather
than periodic; it should be purposeful and
the findings should be used for diagnosis of
strengths and weaknesses in conference with
the teacher. ... If the appraisal is to be valid
the criteria used must correspond to the
basic objectives of teaching in the school;
only those criteria involving desirable pupil
change, or practices and behaviors accepted
as promoting such change, will produce
valid results. The diagnosis on which
evaluation is based must include a comprehensive
analysis of services rendered, with
adequate, objectively observed evidence of
performance.... When teaching is judged
in accordance with these two sets of principles,
we may expect increased confidence
on the part of teachers in those responsible
for such judgments.’’


10. Beecher, Dwight E. ‘‘An Evaluation of the
Attempts of Local School Systems in New
York State to Include Competence Measures
in the Salary Schedules,’’ Harvard Educational
Review, XXII, No. 2 (Spring 1952),
pp. 132-40.
How the New York Salary Law of 1947
was received, and how it affected the education
of youth and the teaching profession
on the local or community level is described.
The rather remarkable feature of a salary
law designed and planned without democratic
participation, but proposing, nevertheless,
a great amount of democratic participation
in local communities in the carrying
out of its provisions, is developed. The
author believes that a great part of what
unfavorable reaction came to the law was
due to the interference of the time element,
which prevented teacher and local participation
in the formulation of the provisions
of the law.


11.*Beecher, Dwight E., and Bump, Janet W.
The Evaluation of Teaching in New York
State; Standards and Procedures Recommended
by Local Advisory Committees.
Albany: New York State Department of Education,
1950. 57 pp.


12. Best, Leonard E. ‘‘Incentive Pay for Better
Teaching,’’ School Executive, LXIX
(May 1950), pp. 43-4.
‘‘A school board member looks at the
problem of recognizing teacher excellence
by paying extra for exceptionally meritorious
service.’’
‘‘Some people say that you cannot evaluate
teachers; therefore, you should pay
more for teachers with greater educational
training.... If we pay on the basis of a college
degree, we really shift the responsibility
from the administrator in our schools
to the professor in the teacher training college.
If this evasion of responsibility
reaches its ultimate extreme, we should
refuse to pass on initial teacher selection....
Any teacher selection must involve an
evaluation of fitness for the job, even though
we have no means of seeing teachers in action
unless we visit them at their schools...
[We m]ight just as well ignore the three
[...] period of tenure if, at the
end of three years we are unable to differentiate
between the teaching capacity and
performances of various teachers. Any
election to go on tenure involves careful
evaluation. We must continue to evaluate
teachers from the time they sign the original
contract until they leave the system...
We should also provide a definite incentive
to all teachers to do their best.’’ Other
arguments in favor of a merit salary.


A statement of the issues involved in
teacher rating, the re[...]
program of appraisal is continuous
[...]prehensive, [...] be flexible, [...]
teachers, [...] is not necessary or
possible in situations which enco[...]dom
of choice, [...] growth and development,
(7) An evaluative process makes intelligent
use of objective testing instruments which
are available or can be constructed.

14. Brandt, W. J. ‘‘Follow-Up of Some Earlier
Wisconsin Studies of Teaching Ability,’’
Journal of Experimental Education, XVII
(September 1949), pp. 1-29.
This is a follow-up of the studies made
by LaDuke in 1945, Rostker in 1945, Jones
in 1946, and Lins in 1946, under the guidance
of A. S. Barr. This study found coefficients
of correlation for various measures
of teaching ability (rating forms) for
reliability, and also comparisons with various
measures of pupil progress. The concern
here is mostly with finding means of
predicting teacher success.

15. Brodsky, Charles. ‘‘The ‘Spark’ in Good
Teaching,’’ Clearing House, XXIV (September
1949), pp. 41-5. (Editor's Note: Mr.
Brodsky calls it ‘spark’. You can't [...]
recognize it in the teacher himself.... [...]
it appears over and over again in the classes
of every teacher who makes education
a genuine treat for his students.)
An eighteen point check-list, mostly
concerning the atmosphere of the [...],
such as: Do you really have fun in the classroom?
Is there so much work going on, [...]
vital to the youngsters, that they can’t help
but talk excitedly to each other from time
to time, even while you’re trying to get
them quiet?
‘‘Let's see who are the teachers with
that ‘spark’. They're all kinds—young,
middle-aged, and old. They're fat and thin.
[...] and prepare for the next day’s work.’’
This teacher [...]

16. Brown, Sara Ann. ‘‘Technique for Evaluating
the Ability of Teachers to Apply Principles
Concerned with the Development Needs
of Adolescent Girls,’’ Journal of Education[...]
[...]chology, XLI ([...] 1950), pp. [...].
An abstract of a thesis in which the author
investigated the efficiency of paper and
pencil tests to measure ability of teachers
to apply principles concerned with developmental
needs of adolescents. In [...]
‘‘It can be reasonably [...] that the ability
of individuals to apply principles can be
[...] These tests should make it unnecessary for
[...] to expend time, [...]
in visiting [...]
schools for the [...] teachers
in their classrooms.’’ Other teacher
efficiencies were not studied.


17. Burke, Arvid J. ‘‘Organized Teachers and
Public Policy on Teachers’ Salaries,’’ Harvard
Educational Review, XXII (Spring
1952), pp. 150-2.
Begins with a discussion of the factors
that have limited the economic effectiveness
of teacher organizations, and a 10 point
program of ‘‘positive’’ action for organized
teachers. (Note: The author appears to be
discussing ‘‘education’’ associations, and
not ‘‘teachers unions’’.) In regard to merit
salaries: ‘‘Who can judge better... (than
teachers themselves)...if their effectiveness
is increased or decreased by policies
which involve fear, conformity, rating by
supervisors, and insecurity, or by policies
which engender self-confidence, freedom,
help and guidance from superiors, and security?...’’


18. Burke, Arvid J. ‘‘Quality of Teaching Service
and Salary Policy,’’ New York State
Education, XXXIX (May 1952), pp. 610-11.
A brief statement of the viewpoint of
‘‘organized teachers’’ on salary policy relative
to merit, summarized in ten ‘‘generalizations.’’
One of these states: ‘‘It is probable
that praise, understanding, sympathy,
helpfulness, and encouragement which result
in hope, happiness, and self-confidence
will promote maximum growth and effectiveness
in a teaching staff. If rating and
evaluation destroy this relationship between
teachers and the administrative or supervisory
staff, the results may be lower quality
of service.’’


19.*Burke, John E. "What Makes a Good
Teacher?" Educational Forum, XVI (January
1952), pp. 205-9.


20. Chamberlain, Leo M., and Kindred, Leslie 
W. Teacher and School Organization, Sec- 
ond Edition. New York: Prentice-Hall, 
1949, pp. 308-17. 

In Chapter X, "Working With Supervisors,"
the authors discuss the appraisal
of teaching under three types: (1) the
method of personal estimate, or subjective
judgment; (2) the teacher rating scale or score
card; and (3) the evaluation of pupil
progress. The rating scale is "unquestionably
of larger importance in... (a supervisory)
...capacity. It makes its greatest
contribution as a check-list and not as a scoring
instrument."


21. Clark, Elmer J. "Relationship Between the
Personality Traits of Elementary School


WATTERS 355 


Teachers and Their Evaluation of
Objectionable Pupil Behavior," Journal of
Educational Research, XLV (September 1951),
pp. 61-6.

Some types of pupil behavior are more 
annoying to teachers with good mental health 
than to those with poor mental health, while 
other types of pupil behavior annoy the teach- 
ers with poor mental health more and those 
with good mental health less. Details of 
the research upon which this statement is 
based. 


22. Cook, William A. "Merit Rating and Salary
Increase," American School Board Journal,
CXXIV (June 1952), pp. 33-4.

A strong defense of the principle of re- 
warding teacher merit by paying salary in- 
crements. There is a list of four items in- 
volved in the price teachers pay for not 
having merit rating, and also a list of five 
things the author says the denial of merit 
rating is doing to ourselves. 


23. Cooke, Dennis H. "Should the Teacher Be
Paid for Tenure?" Phi Delta Kappan, XXXI
(March 1950), pp. 302-4.

Payment of teachers on the basis of years
of service is questioned. (This is what is
meant by being "paid for tenure".) "It has
been my observation that for the most part,
tenure protects the weak and ineffective
teacher and professor. The competent,
conscientious, and hard-working ones do not
need it." Suggests paying teachers on a
salary schedule based about 90% on
education and years of service, and 10% in terms
of tests of students, tests of teachers,
supervisory ratings, etc. Then as we attain
more knowledge and better techniques of
evaluating teachers, the percentage or
weighting assigned to the teacher's rating
could be increased accordingly.


24. Cooke, Paul and Ware, Richard. "Removing
the 'Hurdles' from the Salary Schedule
in Washington, D.C.," American Teacher,
XXXVII (December 1952), pp. 6-9.

A review of the efforts of the American
Federation of Teachers to change the 1947
salary schedule law in Washington, D.C.,
eliminating the "hurdles" which made the
schedule in reality a merit-salary one. The
hurdles were subsequently removed in 1951.
The attitude of the American Federation of
Teachers toward merit-salaries is presented.


25. Coss, Joe G. "Downey Develops Criteria
for Superior Teachers," American School
Board Journal, CXXI (October 1950), p. 74.

The author is the District Superintendent
of Schools in Downey, California, and in the
article describes the experience of a
committee of teachers and administrators to
evolve criteria of the "superior" teacher.
The criteria were used in evaluating
probationary teachers to be given tenure, and ...


Report of a study conducted under the ...
Corporation and the Harvard Graduate School ...


28. Domas, Simeon J. "Report of An Exploratory
Study of Teacher Competence," American
Business Education, VIII (May 1952), pp.
305-7.

A brief description of the study described
in reference 27.


... articles, books, pamphlets ... of teacher
competence, most of which are annotated.
Classification table makes location of desired


articles convenient. It covers publications
up through the first half of 1949. This
present bibliography, while limited to teacher
evaluation and closely related subjects, does
not repeat any of the Domas and Tiedeman
listings. It begins where, in point of time,
they left off.


30. Douglass, Harl R., and Mills, Hubert H.
Teaching in High School. New York: Ronald
Press Co., 1948. Pp. 568-89.

Aspects of teacher evaluation, ...
evaluation, and pupil appraisal. Personal traits
may be listed as teachers themselves see
them, as pupils or administrators see them.
A self-appraisal list of over forty
characteristics is presented (questions which the
teacher may ask of himself), to be marked
on a five-point scale. Also discussion of
teacher-community relationships.


31. Evans, Kathleen M. "A Critical Survey of
Methods of Assessing Teaching Ability,"
British Journal of Educational Psychology,

Four criteria of teaching ability are
examined: Pupil gain in information, the
opinion of experts, ratings on various rating
scales, and the opinions of pupils. None of
these is found entirely satisfactory, but
the opinion of experts is at present the most
suitable for general use. The available
evidence suggests that there is little
agreement between assessments made using
different criteria, but very few studies on the
subject have been reported. Probably the
best criterion of any teacher's worth would
be a composite measure based on pupil
gain in information, ratings by competent
observers, and ratings based on the
opinion of pupils. Having regard to the ...
teacher's performance when ... varied, any
such assessment ... a statement of the type
... size of the class, and of subject matter
being taught when it was ...
The following statements are made
as arguments against the criterion of pupil
change in measuring teacher efficiency:
(1) The principle is simple; the application
has proved less simple. In 30 years no
suitable criterion of pupil change has
emerged. (2) There is need to measure not
only knowledge gained, but changes in
attitudes, ideals, purpose, and personality.
(3) ... that groups are compa...
differences measured ... significant are not
enough; ... but do not mature ...
(4) Sandiford is quoted to the
effect that changes may be due to good
teaching and thorough grounding at earlier
stages. (5) Changes noted may be due to
habits of accuracy and industry inculcated
by earlier teachers, or by parents and
others outside the school. (6) There is
question whether the ability to "get results"
is good teaching. For example, cramming
and other cases of information learned and
quickly forgotten.


32. Farmer, Paul. "Exam for Teachers of
English," National Education Association
Journal, XLII (February 1953), p. 81.

Self-appraisal for English teachers.
Ten questions for the English teacher to
ask of his own performance.


33. Flanagan, John C. Critical Requirements
for Research Personnel. Pittsburgh:
American Institute for Research, 1949.

Description of the "critical incidents"
technique, used later in studies of teacher
competence.


34. Franzen, Carl G. F. "What Supervisory
Practices Promote Teacher Growth and
Cooperation?" National Association of
Secondary School Principals Journal, XXXVI
(April 1952), pp. 17-26.

One of the supervisory practices described
is the use of "improvement sheets" designed
to help the improvement of teacher service
in specific areas or subjects.


35. Gage, N. L., and Orleans, Jacob S.
"Guiding Principles in the Study of Teacher
Effectiveness," Journal of Teacher Education,
III (December 1952), pp. 294-8.

A frame of reference for research is
presented: 1. Research should be concerned
with teacher effectiveness rather than
with over-all effect of all the factors in the
teaching situation (not curriculum,
instructional materials, etc.), but does consider
all influences of teacher. 2. Concerned
with general conceptions of effectiveness
rather than with particular conceptions to
suit special purposes. (Not selection,
determining merit-salary increments, etc.)
3. Concerned, at least initially, with
conceptual analysis of the research job; later
(probably beginning now) more attention
should be given to details of implementation.
4. It is beyond the province of the present
research committee to formulate ends of
education; therefore, concern is with the
effective teacher, rather than with the good
teacher.

A set of guiding principles for research
is offered, and a conclusion that many
detailed studies will be necessary and that
caution in terminology is necessary.


36. Gage, Nathaniel L. and Suci, George.
"Social Perception and Teacher Pupil
Relationships," Journal of Educational Psychology,
XLII (March 1952), pp. 144-52. Data from
No. 5.

Review of a research in which some
agreement was found between student ratings
of teachers and social perception scores
which were a comparison of teachers'
interpretation of pupil attitudes and pupil
attitudes as revealed by a questionnaire.


37. Gans, Roma. "How Evaluate Teachers?"
Educational Leadership, VIII (November
1950), pp. 77-81.

A picture of the short-comings, needs,
and possibilities for the future in teacher
appraisal. Some encouraging signs are
seen in the relationship between curriculum
improvement (as a teacher-participation
activity) and teacher growth. Three
avenues for future action: 1. Check, and
whenever possible undo, practices that
may be destructive of personality growth
of teachers. 2. Continuation and further
extension of curriculum studies in individual
schools, and on a system-wide basis,
should be encouraged. 3. Research studies
planned cooperatively with leaders from
other districts should be made.


38. Gragg, William L. "Experiences With
Merit-Salary Promotions," American School
Board Journal, CXIX (July 1949), pp. 23-5.

Describes the 1947 New York salary
law, and the process used in Ithaca to set
up procedure (election of committees,
etc.). Committees drew up standards of
appraisal, set up the administration of
standards. The evaluation score sheet used was
a weighted rating card:

A. Direct service to pupils          80 points
B. Community Service                  5
C. Personal Qualities (School,
   non-school) activities, and
   professional growth
D. Education                          5

The law required that no more than 75% of
those eligible need be promoted; of 173
teachers, 33 were "eligible"; all 33 were
marked "exceptional", and all were
therefore given the increases. "The fact that
the Board pursued a forward-looking policy
of promotion was probably the greatest
single factor in assuring a successful
continuance of the program."


39. Gragg, William L. "Ithaca's Revised
Teacher Rating Plan," American School
Board Journal, CXXV (October 1952), pp.

A follow-up of previous reference 38.
The changes in procedure brought about by
the action of a committee of teachers and
administrators in view of the 1951 revision
of the 1947 salary law. In the new plan,
there is no numerical rating, but teachers
are evaluated by principals through the use
of an instrument containing the following
main areas, with specific suggestions
under each: I. Direct service to pupils; II.
Teaching ability; III. Contribution of the
teacher to the total school program; IV.
Personal qualities of the teacher; V.
Professional growth of the teacher. (The 1951
revision of the law removed the requirements
... salary increments.)


40. Grim, Paul R., and Hoyt, Cyril J.
"Appraisal of Teaching Competency,"
Educational Research Bulletin, XXXI, No. 4
(April 1952), pp. 85-91.

"Our approach of 'getting within the
individual in order to see ... views the world
is ... taking at face value the belief ...
situations are much the same ..." The study
is only in formative stages, and no
statistical findings are yet available. The
devising of the instruments has been to some
extent an "armchair" process; it is hoped
that empirical validation now in progress
will provide evidence of the soundness of
the rationale.


41. Grotke, Earl M. "Professional Distance and
Teacher Evaluation," Phi Delta Kappan,
XXXIV (January 1953), pp. 127-30.

Two concepts are brought into ...
adapted from the vocabulary of sociology:
Professional Distance, defined as the
frequency and divergency between points of
view held by professional workers as to
their role; and the Role of the Good
Teacher, the professionally determined
behavior expected or required of ...
a specific professional position, i.e., the
position of the classroom teacher. The
hypothesis presented here is that the length
of professional distance would increase as
ratings decrease from good to fair and from
fair to poor, since an evaluation of another's
teaching is an expression of the difference
between one's own concept of another's
teaching and one's concept of the professional
role of the good teacher. The data did not
warrant conclusions which would indicate
that the situation was as clear-cut as the
hypothesis stated, but there is evidence that
the effect is involved in principal's
evaluations of teacher efficiency.


42. Hagman, Harlan L. The Administration of
American Public Schools. New York:
McGraw-Hill Book Co., 1951, pp. 148-53;
200-202.

Chapter IX, "The Nature of Democratic
Supervision," discusses "inspection and
rating" as a "low order" of supervision,
as opposed to the "... social leadership"
at the other end of the scale. "Rating
scales can be used with benefit to
instruction and may be employed on any level of
supervision. Teachers and supervisors may
welcome the rating scale as an inventory of
teacher abilities and as a useful tool in the
determination of need for improvement....
But when the rating scale is used in a
supervisory program ... attained only on the
level of inspection and rating, the scale is
often an instrument damaging to teacher
morale..."


43. Hampton, Nellie Delight. "An Analysis of
Supervisory Ratings of Elementary Teachers
Graduated from Iowa State Teachers
College," Journal of Experimental Education,
XX (December 1951), pp.

This study had two stated purposes: (1)
To find out what ratings can tell about a
particular group of Iowa State graduates; and
(2) to obtain suggestions regarding rating
schemes or systems in general. Related
to the second purpose, there was an
analysis of ratings given at the same time using
different instruments. Some conclusions:
"Correlations between success in trait
ratings of the same persons were all different
from zero at the one percent level, trait by
trait, when the raters were the same, and
nominally equal to zero when the raters
were changed.... The high correlations
between the trait ratings on the five point
scale and the general category rating seem
to indicate that our individual trait ratings
actually are adding very little to our
knowledge of our teachers as a group which a
general category rating could not supply.
This does not mean that individual trait
ratings may not be desired for diagnostic
purposes, but for indications of general
merit, one general rating may suffice."
There are other conclusions.


44. Horrocks, John E., and Schoonover, Thelma
L. "Self-Evaluation as a Means of Growth
for Teachers in Service: Use of a
Self-Analysis Questionnaire," Educational
Administration and Supervision, XXXVI
(February 1950), pp. 83-9.

A general discussion of the difficulties
and advantages of teacher evaluation leads
to a statement that the benefits accruing to
teachers from evaluation may be most
advantageously gained through a process of
self-evaluation. Self-acceptance and
cooperation in devising the evaluative
procedure are involved. This seems the optimum
approach from the standpoint of (1)
readiness, (2) self-acceptance, and (3) incentive
for improvement.


45.*"How Do You Rate with Business?" Grade
Teacher, LXIX (April 1952), p. 76.


46. Jarecke, Walter H. "Evaluating Teaching
Success Through the Use of the Teaching
Judgment Test," Journal of Educational
Research, XLV (May 1952), pp. 683-94.

A report of a research project, the
purpose of which was to design a test to
evaluate some of the factors which contribute to
the success of teachers on the secondary
school level, in relation to their
performance in the classroom, associations with
other teachers, and other aspects. Some
conclusions: Teacher experience seems to
have a bearing on teacher success. Some
unnamed factors, possibly "stability,"
affect teaching success. There seems to be
a relationship between scholastic ability
and teaching success as measured by the
Teaching Judgment Test. There are other
conclusions on about the same level of
significance.


47. Jensen, Alfred C. "Determining Critical
Requirements for Teachers," Journal of
Experimental Education, XX (September
1951), pp. 79-85.

"This paper describes one of the
approaches employed by the Teacher
Characteristics Study in attempting to define
criteria of teaching effectiveness." The
approach described is that known as the
"critical incidents technique." (See Domas,
Simeon J. "Report of an Exploratory Study of
Teacher Competence," ref. 28.) Critical
requirements are set forth under three
categories: Personal Qualities, Professional
Qualities, and Social Qualities. Effective
and ineffective examples are given under
each. "It is suggested that the critical
incidents technique might be employed
profitably in local school situations in
developing valid bases for teacher evaluation, and
as an aid to the in-service growth of
teachers."


48. Kandel, I. L. "What is Teaching
Competence?" School and Society, LXXIII (May
1951), pp. 315-16.

A discussion of the implications of the
question related to the rights of parents,
teachers' associations, Board of Education,
and other groups in the dismissal of a
teacher. The case in point was in a report of
the National Commission for the Defense
of Democracy Through Education. The
author is critical of the position taken in the
report.


49. Kauffman, Grace I. "How Professional Am
I? A Self-Test," National Education
Association Journal, XXXIX (April 1950), p. 288.

"A Self-Test designed to emphasize the
positive." It has items grouped under six
areas: Teacher-Pupil Relationships,
Teacher-Teacher Relationships,
Teacher-Administrator Relationships, Teacher-Board of
Education Relationships, Teacher-Public
Relationships, and Teacher-Professional
Relationships.


50. Lacy, Susan, Miller, John L., and Wardner,
Phillip. "Merit Rating: A Symposium,"
Educational Leadership, IX (October 1951),
pp. 17-21.

A year after the Association for
Supervision and Curriculum Development
pamphlet on rating was published (see reference
13), three educators state their reactions
to the viewpoint expressed therein. Susan
Lacy makes a sympathetic case for the
usefulness of the pamphlet, reviewing how
"Young Principal" finds it impossible to
rate, and enlists the cooperation of the
teachers in continuous evaluation. John L.
Miller is critical of "Better Than Rating"
and asks for continuing study of the
possibility of relating salary and evaluation.
"Moreover, to contend that rating of the
teacher on tenure is impractical, and at
the same time to stress the value of
pre-employment rating, is not consistent."


Phillip Wardner extols the timeliness and
virtue of "Better Than Rating."


51. Lamke, Tom Arthur. "Personality and
Teaching Success," Journal of Experimental
Education, XX (December 1951), pp. 217-9.

The study sought to answer two questions:
Are the personalities of good and poor
teachers, as evaluated by Cattell's "16
Personality Factor Test," characteristically
different? Are the personalities of good and poor
teachers, as evaluated by a paired
comparison scale based on Cattell's "20 Surface
Traits," characteristically different?
Results were not very conclusive, but, for
example, it "appears that good teachers are
likely, more than poor teachers, to be
gregarious, adventurous, frivolous, to have
abundant emotional responses, strong
artistic ... interested in the opposite sex, to
be polished, fastidious ... Poor teachers are
likely, ... teachers fail for varying
reasons.... It appears that success may be
a 'balance'."


52. ... Scale for Measuring ... and
Teacher-Pupil Rapport," Psychological
Monographs, LXIV (1950), pp. 1-24.

A scale developed for measuring
teacher-pupil attitudes correlated .49, .48, and .45
with principal's ratings, observer's ratings
and pupil's ratings. (Data from Ref. 5)


53. Leeper, Robert R. "Fred Brown vs. the
Elusive Ideal Teacher," National Education
Association Journal, XXXIX (December
1950), pp. 672-4.


A review of the philosophy of "Better Than
Rating," in which are discussed the ...
teacher rating plans:
1. Teacher rating plans often fail to respect
   individual personality.
2. Rating plans tend to encourage ...
   rather than acting on thinking.
3. Most rating plans fail to use cooperative
   social action.
4. Most plans lack qualities of cooperative
   evaluation.
   a. Rating plans are an intermittent rather
      than a continuous form of evaluation,
      and are directly or indirectly imposed
      from the outside, rather than developed
      as an integral part of the
      learning-teaching situation.
   b. In most rating plans, evaluation is a
      consensus of people in status positions,
      rather than a cooperative responsibility
      of all persons affected by the process.
   c. Plans often work to prevent, rather
      than to foster and guide, change in
      behavior.
   d. When plans work for behavioral change,
      the direction of change is imposed by
      the plan, rather than evolved
      cooperatively by the group being evaluated.
   e. Most plans leave little or no
      opportunity for intelligent selection and use
      of the best techniques for gaining
      evidence of behavioral change.
A "better way" is indicated: The school
community must organize in a voluntary,
cooperative manner to encourage ...
growth. The school community must provide
its teachers with rich opportunities for
professional growth and development, and must
give attention to several important
personnel practices.


54. Lindsey, Margaret. "Ask Yourself Some
Questions," National Education Association
Journal, XL (March 1951), pp. 173-5.

A discussion of the criteria of the good
teacher for self-evaluation. Are you happy?
Are you social? Informed? Flexible?
Interested in children? Democratic? Proud
of your profession?


55. "Check Sheet--More Help for
the New Teacher," Clearing House, XXV
(May 1951), pp. 548-9.

A description of a simple check sheet
for the supervision personnel to use in
connection with visitation. Only "best" things
are found in the list to be checked.


56. McCall, William A. Measurement of Teacher
Merit, Publication No. 284. Raleigh,
N.C.: North Carolina Department of Public
Instruction, 1952, 40 pp.


Report of a research in teacher-merit.
The general plan was to measure
comprehensively the growth produced in each class
by the teacher of that class, to weight the
elements of growth according to importance,
to secure a single composite figure for the
growths made by each class, to correct this
weighted crude growth for capacity to grow
and for differences in class size if the
latter appeared to influence growth, and then
to correlate a large number of measures of
the teacher's traits with this "purified"
criterion of each teacher's worth as a
teacher. Growth was measured by standardized
tests in the following areas: General Mental
Ability, General Education (including
Citizenship, Foreseeing Consequences,
Understanding the World, and Prejudice), Range
of Knowledge and Experience, Social
Behavior, Creative Composition, and
Handwriting. More than a score of old and new
methods of evaluating teachers were compared
with the pupil growth thus measured. None
of them correlated high. Principal's
ratings correlated slightly negatively.
Experience showed little relationship. Training
showed just a little more relationship
(roughly, about 10% of the difference in teacher
efficiency could be attributed to training).
The best correlation of all was found to be
confidential teacher self-ratings (plus 39%
index of validity, corrected for attenuation
to plus 59%). Also fairly high was pupil
rating of their teachers on a social
behavior scale. "All things considered, this
study failed to find any system of measuring
teacher's merit which the writer is willing
to recommend being adapted as the basis of
paying the salaries of teachers. This study
did establish....that the system of merit
rating by official superiors.... is of no
value...(p. 37)." "This whole study stands
or falls on the acceptability.... of the proved
ability of the teacher to produce growth in
pupils (as the criterion)... If the reader does
not agree that this criterion is the chief and
proper one, he will be unable to accept any
of the conclusions of this research."
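
The arithmetic behind the phrase "corrected for attenuation" in the McCall annotation can be illustrated with a minimal sketch. It assumes Spearman's classical correction, in which an observed correlation is divided by the square root of the product of the reliabilities of the two measures. The annotation reports only the observed and corrected figures (39% and 59%, read here as correlations of .39 and .59); it does not give the reliabilities or state which formula McCall applied, so the function names and the implied reliability product below are illustrative assumptions only.

```python
import math


def correct_for_attenuation(r_xy, r_xx, r_yy):
    """Spearman's classical correction (assumed here, not confirmed by the
    annotation): estimate the correlation between true scores from the
    observed correlation r_xy and the reliabilities r_xx and r_yy."""
    return r_xy / math.sqrt(r_xx * r_yy)


def implied_reliability_product(r_observed, r_corrected):
    """Product r_xx * r_yy that would turn r_observed into r_corrected
    under the same formula. Hypothetical helper for illustration."""
    return (r_observed / r_corrected) ** 2


# Figures quoted in the annotation, read as correlations.
observed = 0.39
corrected = 0.59

# Reliability product implied by the two figures (not reported in the source).
product = implied_reliability_product(observed, corrected)
print(round(product, 3))

# Round trip: applying the correction with that product recovers .59.
print(round(correct_for_attenuation(observed, product, 1.0), 2))
```

Under these assumptions, the two quoted figures together imply a combined reliability product of roughly .44, which shows how much of the gap between .39 and .59 would be attributed to measurement error rather than to the self-ratings themselves.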


57. McCartha, Carl W. "The Practice of
Teacher-Evaluation in the South-East in 1948,"
Journal of Educational Research, XLIV
(October 1950), pp. 122-8.

Results of a study in which questionnaires
were sent to city and county school
administration units in ten southeastern states. Of
778 replies received, 671 favored teacher
evaluation, but only 170 (about 25%) were
doing any evaluating. Other figures are
given and observations made.


58. McGowan, W. N. "The Measure of a
Successful Teacher," American School Board
Journal, CXXI (July 1950), pp. 17-19.

A re-evaluation of the qualities that mark
a successful teacher in the light of progress
made during recent years.


59.*McNerney, Chester T. Educational
Supervision. New York: McGraw-Hill Book Co.,
1951, pp. 117-23.


60. Mazzei, Renato. "Desirable Traits of
Successful Teachers," Journal of Teacher
Education, II (December 1951), pp. 291-4.

An analysis of the traits approved and
disapproved in their teachers by junior high
school boys and girls. Similar to the Witty
"Quiz Kids" study. (See ref. 99.)


61. "'Merit' Rating--What's Wrong With It?
New York's AFT Members Give the Answer,"
American Teacher, XXXIII (April 1949),
pp. 7-8.

A summary of the reasons why the AFT
opposes the basing of salaries on merit,
from a letter from the Empire State
Federation of Teachers Unions protesting the
endorsement of the 1947 New York Salary Law
by the State Board of the American
Association of University Women. Seven main
reasons for the opposition of the AFT are
described: (1) The school is not an industry
and the products are intangibles whose
lasting effects may not be apparent until much
later; they cannot be measured. Who is
qualified to set monetary values on them?
(2) The unreliability and lack of validity in
rating; the influence of personal relations
and bias. (3) Administrators lack time to
make even subjective judgments, based on
visits and observation. (4) The
inevitability of a "popularity contest," with teachers
being "yes-men" instead of cooperating and
sometimes criticizing. (5) Do we want
teachers or club-women? The teacher often
has little time to join and participate in
community organizations. (6) The
unworkability, in experience, of merit-salary laws.
(7) Teachers, supervisors, administrators
and outstanding educators oppose it.


62. Michael, William B., Herrold, Earle E.,
and Cryan, Eugene W. "Survey of
Student-Teacher Relationships," Journal of
Educational Research, XLIV (May 1951), pp.
657-73.

It was found that boys and girls prefer
teachers who allowed voluntary answers to
questions to those who called on specific
individuals, and also prefer teachers who
participate in extra-curricular activities. Other
findings.


63. Miller, Leo R. "Let Those Who Teach
Rate for Merit," School Executive, LXVIII
(May 1949), pp. 55-6.

Proposes a rating system consisting of
anonymous ratings made for each teacher
by every other teacher. Such ratings would
count for no less than the supervisory
ratings made by the principal. There would
... ratings by supervisors who visit only
... A number of justifications
are given for the plan, among which are:
Teachers may be biased and prejudicial, but
it is likely that 20 teachers will be fairer
than one administrator; teachers would
benefit from seeing what colleagues think of
their services. (Data from ref. 5.)


64. Milwaukee Public Schools, Office of the
Superintendent. Report on Teacher Evaluation.
Compiled by Harold S. Vincent. Milwaukee
Public Schools, April 1951, 8 pp.

Brief report of a survey of teacher
evaluation practices in 22 large cities, a
discussion of the purposes of teacher
evaluation, and recommendations for the
Milwaukee Public Schools.


65. Misner, Paul J. "Teacher Rating is the
Responsibility of the Entire Profession,"
Nation's Schools, XLVII, No. 2 (August 1951),
pp. 23-4.

Starting out with a frank criticism of the
A.S.C.D. pamphlet "Better Than Rating,"
... Illinois, is ...
"Rating done exclusively by su...
a red herring (in 'Better ...
rendering continuing effective
and competent service."


66. ... Teaching Service," Harvard Educational
Review, XXII (Spring 1952), pp. 124-35.

... the solutions tentatively proposed, and
the final recommendations. The treatment is
sympathetic to the New York plan of
merit-salaries.


67. Morrison, J. Cayce. "It's Time to Adjust
Salaries to Quality of Teaching," Nation's
Schools, LI (February 1953), pp. 45-8.

The case for salary related to merit
against the background of the experience of
New York with the salary law of 1947
revised in 1951. "Teachers themselves can
and must make the chief contribution to
developing the theory and practice of
evaluating teaching."


68. Morrison, J. Cayce, and Burke, Arvid J.
"Basing Salaries on Quality of Teaching.
A Defense and a Criticism of New York's
Merit Law," Nation's Schools, XLIV, No.
3 (September 1949), pp. 52-4.

In this defense of the law, written by
Morrison, and the criticism written by
Burke, is a fairly complete description of
the New York Salary Law of 1947, and also
a development of arguments both for and
against such rating of teachers to determine
salary.


69.*Mott, E. P. "Teacher Failures in the
Public Schools," Agricultural Education
Magazine, XXII (March 1950), pp. 208-9.


70. National Education Association, Department
of Classroom Teachers, and Research
Division. Teacher Rating, Discussion
Pamphlet No. 10. Washington, D.C.: National
Education Association, April 1950, 24 pp.

An informal discussion of the rating
problem. As to the reasons for rating:
"Estimates of a teacher's work are necessary
for two purposes: As a basis for
administrative decisions, and as a basis for
improving instruction. The merit of a teacher
is a guiding principle for each of these
purposes." On the criteria of good rating:
Should be objective, reliable, and valid, ...
a measure of teaching success or some im...
rater and teacher "... bound to occur when
one adult judges the work of another." The
following are suggested as a basis for rating,
... recommendation that a composite ...
pupil progress (several ... mentioned),
methods of teaching ... contributions to
school ... The types of ... in use are:
Check list, ... characterization report,
descriptive report, and ranking report. As
... to rating, the use of ... Personnel
Record is suggested. "The collection of
supporting ... for rating opens the way to the
use of a cumulative record." Such a file
would contain evidence presented before
employment, evidence of work done,
indications of prestige, personal growth. A table
is presented with statistics relative to
teachers' opinions on rating.


71.


72.


73.


National Education Association, Research 
Division. ‘‘Teacher Personnel Procedures, 
1950-51: Employment Conditions inService,’’ 
Research Bulletin, XXX (April 1952), pp. 
ο. See ‘‘Appraisal of Service, ” pp. 

-9. 

Two tables are presented with a number 
of interesting and pertinent statistics on 
teacher evaluation. Cities are classified 
into groups, according to population, and 
the statistics presented for. each group. 
Some of the items on which figures are giv- 
en: Number of cities rating various groups 
and classifications of employees, how many 
give copies of rating to the teachers, num- 
ber which use no ratings, types of rating 
Scales in use, and the uses made of the rat- 


ings. 


National Education Association, Research 
Division and American Association of School 


Administrators. Promotion and Appraisal 
Procedures in City School Systems, 1950- 


51, Educational Research Circular No. 2, 


(February 1953), 28 pp. . 
Contains two tables from questionnaire 
research data. One is concerned with Pro- 
cedures in Selection of Teachers for Pro- 
motion and the other with Methods and Pro- 
cedures of Appraisal. Much of this data is 
the same as that given in the previous ref- 


erence 71. 


Ojemann, Ralph H. “Identifying Effective 
Classroom Teachers, ” in Bases for Effec- 
tive Learning, Thirty-First Yearbook, De- 
partment of Elementary School Principals, 
National Education Association (September 
1952), pp. 130-8. 

An attempt to answer the question, ‘‘What 
kinds of teachers are effective in the class- 
room?"' ‘We cannot locate the effective 
classroom teacher by taking into account 
only knowledge of subject matter taught, or 
interest in teaching. ...or any other single 
factor. The effective teacher, in the light 
of our present knowledge, has a combination 
of several factors. ...If we wish to locate 
the teacher who develops favorable attitudes 
in children, ... we should observe whether 
(he) is interested inworking with them, rather 
than in dominating them; whether heattempts 
to guide them by fear, ridicule, sarcasm, 
and partiality, or by a laissez-faire or in- 


consistent procedure. "' 


74. Orleans, Jacob S., and others. “Some Preliminary Thoughts on the Criteria of Teacher Effectiveness,” Journal of Educational Research, XLV (May 1952), pp. 641-...

“Perhaps a major weakness of educational research has been the failure to do the basic thinking which is needed to insure that the right questions are being asked and that sound planning is being done. In dealing with a problem which is as abstruse and complex as that of teacher effectiveness, much time must be devoted at first to reading and to deliberations. The present paper summarizes the thinking done in this earlier stage.” The paper presents a comprehensive look into the difficulties which must be faced by workers in this field. The basic concept is that the ultimate criterion of teacher effectiveness is change in pupil behavior; but that proximate criteria, or predictors of the ultimate criterion, may be found and identified.


75.*Parent, N. J. “Evaluation of Good Teaching,” Michigan Education Journal, XXX (October 1952), pp. 179-80.


76. Pearman, W. I. “Toward a Supervisor-Teacher Partnership in the Evaluation of Teaching,” High Points, XXXI (May 1949), pp. 22-30.

The evaluation referred to is that which takes place when or after a supervisor visits a teacher. The partnership is in planning, carrying out and evaluating the observed teaching.


77. “Policies Relating to Salaries for Teachers,” Journal of Teacher Education, III (June 1952), p. 113.

A statement of policy adopted by the commission in 1952, which states minimums for experienced and inexperienced teachers, involves single salary standards based on preparation and experience and does not mention merit salaries.


78. “Rating Probationary Teachers,” American School Board Journal, CXIX (July 1949), p. 30.

A new form for rating probationary teachers, worked out by a committee and presented to the New York City Board of Superintendents, to replace an obsolete scheme. Principals are asked to check the competencies of beginning teachers under five headings and to add recommendations for permanent appointment: (1) Personal and Professional Qualities (8 sub-headings), (2)


... and Instruction (10 sub-headings), (3) ... (4 sub-headings), (4) Participation in School and ... (3 sub-headings), and (5) Principal's Estimate of General Fitness (Additional Remarks).


79.*Rechard, O. H. “Appraising and Rewarding Teacher Effectiveness,” National Conference on Higher Education Addresses, (1951) pp. 19458.


80. Reeder, Ward G. The Fundamentals of Public School Administration, Revised Edition. New York: Macmillan Co., 1951, Chapter IX, pp. 216-36.

Discusses philosophy, need and methods of evaluation. Questions for discussion.


81. Rogers, Dorothy. “Implications of Views Concerning the Typical School Teacher,” Journal of Educational Sociology, XXIII (April 1950), pp. 482-7.

Lists of twenty-one adjectives were given to students and to teachers with instructions to mark those most often used by the public to describe teachers. Inferences are drawn from what students believe the public thinks of teachers, and from what teachers ...


82. Ryans, David G. “The Criteria of Teaching Effectiveness,” Journal of Educational Research, XLII (May 1949), pp. 690-9.

Out of the problems and experience relating to the National Teachers Examinations emerges a quest for criteria of teaching effectiveness. “Teaching is effective to the extent that the teacher is able to provide ways and means that are favorable to the ... work habits, desirable attitudes, and adequate personal adjustment on the part of the pupils.” The ... are ratings of teach... and measures of pupil change. After ... the difficulties, ... and lack of research relating to each, it is suggested that it is important to find secondary criteria, or factors that reflect the basic criterion; those could be used as reliable guides to teacher effectiveness.


83. Ryans, David G. “A Study of Criterion Data,” (“A Factor Analysis of Teacher Behaviors in the Elementary School”), Educational and Psychological Measurement, XII (Autumn 1952), pp. ...

An answer was sought to the question: “Should the ... analysis of predictors (of teaching success) be carried out with regard to a single overall criterion, or should the research proceed on the assumption that the criterion consists of several dimensions?” By using trained observers with a prepared form (the Classroom Observation Scale), noting both teacher and pupil behavior, and applying factor analysis, it was indicated that the criterion did consist of several dimensions, and that these ... associated for predictive purposes. Five factors emerged from the analysis:

1. Factor A: Appears to be defined in terms of originality, adaptability and tolerance.
2. Factor B: A business-like, organized approach.
3. Factor C: Two clusters, (a) understanding, kindly, fair ... and (b) tendency to be composed, steady, easy-going.
4. Factor D: Approachable, friendly, tactful, gregarious.
5. Factor E: Related to superficial appearance, as physique, voice, etc.

“It is of interest to note that the pupil behavior traits contribute significantly to factors A and B but that these traits have ... zero loadings on Factors C, D, and E. This may suggest that pupil behavior in classes may be, to a considerable extent, a function of the teacher's ability to (1) stimulate the pupils and (2) maintain effective class control ... the pupils are alert and responsible and are participating in constructive activities.” Another interesting fact brought out was with regard to the correlations between assessments by different observers, ... 0.68 to 0.84 “when the assessments of all teachers observed by the same ... were considered.” This study is a ... of a research ... sponsored by ... Education. It is ... known as “The Teacher Characteristics Study.”


84. Ryans, David G. “A Study of the Extent of Association of Certain Professional and Personal Data with Judged Effectiveness of Teacher Behavior,” Journal of Experimental Education, XX ..., pp. ...

A progress report on the work of the Teacher Characteristics Study under the directorship ... observers judge the ... of classroom teaching ... been compared with the results of professional and personal inventories.


85. Scates, Douglas E. “The Good Teacher: Establishing Criteria for Identification,”


Journal of Teacher Education, I (June 1950), pp. 137-41.

Discussion of the complexity involved in all attempts to develop criteria of teacher competence. “The defining of good teaching is not impossible; it is difficult because of the psychological subtleties and because of the interplay of many factors. It cannot be accomplished in terms of any single pattern of characteristics unless these are made very general.” “Through a careful balance between research studies and careful insightful thinking, we can expect to make progress even though this area may be difficult.”


86. Schwartz, A. N. “A Study of the Discriminating Efficiency of Certain Tests of Primary Source Personality Traits of Teachers,” Journal of Experimental Education, XIX (September 1950), pp. 63-93.

The problem attacked in this study was the predicting efficiency of certain personality tests as far as teaching success was concerned. These were measured against the criteria of marks, ratings in student teaching, and ratings in service. A few, but no very significant, relationships were discovered between the tests and teaching success as measured by ratings. There are recommendations for further research.


87. Shane, Harold G. “Seven Types of Teacher Appraisal,” Nations Schools, L (July 1952),

From a survey of “thirty-five outstanding school districts” in various parts of the United States, comes this digest of findings relating to teacher appraisal. Seven types of evaluation were in use and were reported in the following frequencies:


                                                  Systems

Rating scale or check-list                            8
Written reports following classroom visits
No formal rating plan                                15
Self-appraisal form prepared by the teacher          10
Verbal reports, principal to central offices         16
Subjective appraisal by superintendent               10
Group evaluations by teacher's fellow workers

There were 70 responses (above) from 35 districts, so it is obvious that most systems used, on the average, at least two of these methods. Most of the administrators sampled felt that teachers should not be merit-rated to determine progress upward on the salary schedule, but in an apparently contradictory stand, five out of six superintendents felt that teachers should be dismissed when they were evaluated and found to be deficient in ability or personality. “This seems to be a common-sense attitude rather than an inconsistency. It is absolutely necessary in the interest of the children for the administrator to discharge persons who are ineffectual in the classroom. It is not, however, necessary to establish degrees of excellence among capable teachers.”


88. Simpson, Ray H., and others. “A Study of Resourcefulness in Attacking Professional Problems,” School Review, LX (December 1952), pp. 535-40.

“Resourcefulness” was measured by the number and quality of suggested solutions or responses to a hypothetical problem situation in a classroom. It was found that certain groups of teachers classified according to grade levels and subjects taught were significantly different from others.


89. Spalding, Willard B. “New York's Unwise Plan of Recognizing Merit in Teacher's Salaries,” American School Board Journal, CXIX (July 1949), pp. 21-3.

An argument against the principle of merit salaries, and a description of the “specious” reasoning which, it is said, underlies such attempts as the 1947 New York law. This article is in part a critical response to one by Francis T. Spaulding, in the Phi Beta Kappan of July 1947, explaining and praising the law.


90. Symonds, Percival M. “Reflections on Observations of Teachers,” Journal of Educational Research, XLIII (May 1950), pp. 688-96.

The author observed 24 teachers preparatory to a study on the relation between the personality of the teacher, the mode of teaching, and the pupil response in the classroom. Comments and insights resulting from those observations are presented. The variation and complexity of personalities was great and it appeared that there was no one “best” type of teacher. Almost all types of personalities were found among successful teachers, and the accepted belief that only normal, well adjusted persons should be teachers seemed not to hold, for some of the successful teachers observed were definitely neurotic and their neuroticism contributed to their success as teachers. There were some general characteristics of the successful teacher that seemed to cut across all the variations in personality. They were all more or less secure ...; they were interested ..., and they were able to accept them. Sincerity seemed an important factor. Another interesting observation: “In general the school takes on the color and mood and atmosphere of those in administrative authority.” “So far as they can, the teachers will also play roles similar to those set by the principal as a pattern. The principal of a school sets the pace for the whole school, teachers and pupils.”


91.*“Teachers Ask End of Rating,” Scholastic, LVIII (February 7, 1951), p. 5t.


92. “Teacher's Marks Abolished: New Zealand Plan,” Times Educational Supplement, No. 1945 (August 8, 1952), p. 667.

That the old system of numerical grading of teachers be abolished and be replaced by a system embodying promotion lists was a recommendation reported in this news article. Other information is given about the old and the proposed new plan of teacher appraisal in New Zealand.


93. Tiedeman, David V. “Pay for Teaching,” Harvard Educational Review, XXII (Spring 1952), pp. 97-112.

Neither the “preparational” nor the “positional” salary schedule accomplishes the ... measures of preparation; the “positional” type refers to the kind of position ... a merit-type ..., involving: 1, ... inconsistent, since ... preparational type salary schedule which must be ... proved, and not that of the merit type, ... if the assumptions are accepted that the purpose of paying teachers is to reward them for ...ing children, and that the proficiency with which they do this is not the same for all teachers.... The most insidious argument against the merit-type salary schedule is that it destroys the morale of a school system.” Facts are listed, then “these arguments are indeed powerful ones. Certainly, if the solidity, mutual respect, and cooperativeness of a faculty is disintegrated because of the adoption of a merit-type salary schedule, the plan should be discontinued.” Again, p. 108, “the proposal for adoption of a merit-type salary schedule is not likely to come from teachers.”


94. Tompkins, Ellsworth and Armstrong, W. Earl. “Teacher Ratings: Persistent Dilemma,” National Association of Secondary School Principals Bulletin, XXXV (May 1951), pp. 25-31.

In this review of the subject of teacher evaluation, the authors report ... from a study of the literature: 1. There is a great deal of material printed, for, against, and in-between. 2. Earlier publications emphasize technique and manner, while of late there is more question raised as to the desirability of rating. 3. There is increasing mention of the complex factors involved. 4. There is growing concern for the morale factor, the effect of rating on teachers. 5. Increasing tendency to recognize self-evaluation and cooperative group evaluation as more productive of results. 6. There is still a great deal of bias and disagreement, not likely to decrease.


95. Trump, J. Lloyd. “Merit Rating Puts the Cart Before the Horse,” Nations Schools, XLV (June 1950), pp. 51-3.

Arguments for and against merit rating as they appear to the Board of Education member, the superintendent, the principal, and the teacher. When working conditions are considered in relation to the function of the teacher, it is seen that working conditions affect the efficiency of teaching. Various phases of the “job of the teacher” are ... and in each case it is demonstrated that one of the important factors in excellent teaching is the condition under which teaching is carried out. “It would seem desirable to seek first the correction of conditions under which teachers frequently ... before placing emphasis on evaluation.”


96. Unzicker, Samuel P. “Ends and Means in Supervision,” Educational Administration and Supervision, XXXVI (November 1950), pp. 385-95.

Supervision as a two-way process; democratic, non-authoritarian supervision. The second part of the article raises the question, “What part does appraisal, evaluation, ‘rating’, have in this process?” “One way for a teacher to be brought to accept appraisal by another might be to help to plan the items in any rating form, or at least to fully accept their implications.”


97. Vander Werf, Lester. “The Trouble With Rating Systems,” American School Board Journal, CXXV (August 1952), pp. 17-8.

The author is sceptical of rating systems because they involve the following assumptions, “all partly or completely false.” 1. That teaching can be accurately measured; 2. that administrators can be objective in their judgments; 3. that individual competitive situations encourage competence and high morale; 4. that teaching staffs lie on the curve of normal distribution. Suggestion of a positive program of improvement of instruction rather than merit rating.


98. West, Allan M. “The Case For and Against Merit Rating,” School Executive, LXIX (June 1950), pp. 48-50.

A statement of both sides of the case, for and against merit-type salary schedules.


99. Witty, Paul. “Some Characteristics of the Effective Teacher,” Educational Administration and Supervision, XXXVI (April 1950), pp. 193-208.

Describes the results obtained from an analysis of 14,000 letters sent in as entries for an essay contest sponsored by the “Quiz Kids” radio show, on the topic “The Teacher Who Has Helped Me the Most.” Twelve of the most often mentioned characteristics, in order of frequency, are:

1. Cooperative, democratic attitude
2. Kindness and considerateness
3. Patience
4. Wide interests
5. Pleasing personal appearance and manner
6. Fairness and impartiality
7. Sense of humor
8. Good disposition and consistent behavior
9. Interest in pupils' problems
10. Flexibility
11. Use of recognition and praise
12. Unusual proficiency in subject

A list of a dozen negative characteristics, things which pupils do not like, is also presented.


INDEX AND CLASSIFICATION OF ANNOTATIONS 
(Figures refer to references, not pages) 


Approaches 1, 2, 3, 13, 30, 40, 47, 84.
Check-lists 15, 20, 30, 32, 49, 54, 55.
Community relations 3, 13, 30.
Concepts 6, 9, 13.
Criteria 5, 6, 9, 13, 15, 25, 27, 28, 31, 35, 51, 54, 56, 58, 60, 73, 74, 82, 83, 84, 85, 90, 99.
Critical Incidents Technique 27, 28, 33, 47, 83.
Levels of efficiency 2, 35, 38, 39, 41, 43, 70, 71.
Master teacher 2.
Mental health 21.
Merit-salary and merit-rating 2, 4, 7, 10, 12, 13, 17, 18, 22, 23, 24, 38, 50, 61, 63, 65, 66, 67, 68, 89, 93, 94, 98.
Methods of appraisal 25, 31, 40, 57, 70, 71, 72, 82, 84, 87, 92.
New York Salary Law 7, 10, 11, 38, 66, 99.
Organized Teacher's policy 17, 18, 24, 61.
Participation 9, 10, 13, 38, 39, 53, 63, 65.
Personality traits 15, 21, 30, 36, 46, 51, 56, 60, 62, 73, 81, 83, 84, 86, 90, 99.
Prediction 14, 16, 36, 46, 47, 51, 73.
Preparational schedules 23, 77.
Professional distance 41.
Pupil growth 1, 3, 13, 20, 31, 56.
Rating 1, 2, 4, 5, 9, 12, 13, 14, 20, 22, 23, 31, 41, 43, 50, 53, 56, 63, 70, 71, 72, 78, 87, 92, 94, 97.
Research 1, 3, 5, 6, 14, 16, 21, 27, 28, 33, 36, 40, 43, 46, 51, 52, 56, 57, 64, 70, 71, 72, 82, 83, 84, 86, 87, 88.
Self-appraisal 30, 32, 44, 49, 54, 55, 56.
Summaries 1, 2, 5, 9, 13, 20, 30, 31, 37, 42, 53, 58, 71, 80, 94.
Supervisory use of appraisal 9, 15, 18, 20, 34, 42, 44, 76, 96.
Teacher Characteristics Study 47, 82, 83, 84.
Teacher dismissal 48.
Teacher efficiency definitions 3, 9, 56, 73, 85.
Teacher-pupil relations 5, 13, 15, 21, 36, 40, 52, 60, 62, 81, 83, 99.
Teacher resourcefulness 88.




INTELLIGENCE LEVELS AND CORRESPOND- 
ING INTEREST AREA CHOICES OF NINTH 
GRADE PUPILS IN THIRTEEN 
MICHIGAN SCHOOLS 


KENT W. LEACH 
University of Michigan 


WHAT ARE the first choices of interest areas for pupils of higher intelligence levels as compared to selections made by pupils in the middle or lower intelligence groups? Numerous factors, including the types of tests used to measure intelligence and interest as well as the age and grade level of the pupils tested, would influence such choices. This article presents data derived from scores made by 779 ninth grade boys and girls from thirteen Michigan high schools on the New California Short-Form Test of Mental Maturity, Advanced, 1947, and the Kuder Preference Record, Vocational, Form CH. The data are presented in such a manner that the reader can ascertain a pupil's first choice of interest area on the Kuder record and his corresponding position on a five-group scale of intelligence based on the California test scores.

Table I shows that thirteen schools, all relatively small in enrollment, voluntarily participated in this study. These schools sent in results for both the New California Short-Form Test of Mental Maturity, Advanced, 1947, and the Kuder Preference Record, Vocational, Form CH.¹ Using the Michigan High School Athletic Association system of classification,² we see that no Class A schools took part in this study. Of the thirteen schools, two were Class B; eight, Class C; two, Class D; and one, Class E. Two of the schools are non-public; three are in metropolitan areas.
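
The class letters in the paragraph above follow from enrollment cut points. As a minimal sketch, assuming the lower bounds given in the footnote on classes (800, 325, 125, and 75 pupils) and the eleven enrollments that are legible in Table I, the classification might be reproduced as follows. The function name `mhsaa_class` and the dictionary are illustrative only and are not part of the original study; the two remaining schools are omitted because their rows cannot be read.

```python
from collections import Counter

def mhsaa_class(enrollment: int) -> str:
    """Return the class letter for a grades 9-12 enrollment.

    Cut points are the lower bounds read from the article's footnote:
    A: 800 or more, B: 325+, C: 125+, D: 75+, E: fewer than 75.
    """
    if enrollment >= 800:
        return "A"
    if enrollment >= 325:
        return "B"
    if enrollment >= 125:
        return "C"
    if enrollment >= 75:
        return "D"
    return "E"

# Enrollments for schools 1-11 as read from Table I (school 1 is taken
# to be 132; schools 12 and 13 are not legible and are left out).
enrollments = {1: 132, 2: 62, 3: 335, 4: 119, 5: 387, 6: 185,
               7: 298, 8: 302, 9: 311, 10: 283, 11: 238}

class_counts = Counter(mhsaa_class(n) for n in enrollments.values())
for letter in "ABCDE":
    print(letter, class_counts[letter])
```

For the eleven legible schools this gives no Class A schools, which agrees with the text; the text's totals of eight Class C and two Class D schools would require the two unread schools to fall in those classes.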

Five intelligence levels or groups were selected arbitrarily by marking off division points on the Michigan profile sheet. This profile sheet depicts Michigan norms for the two tests used in this survey. Table II shows the five groups with corresponding percentiles and raw scores for language, non-language, and total mental factors of the California test and similar raw scores for the Kuder record. The levels of intelligence in Table II show that group one represents the superior group; group two, the above-average group; group three, the average group; group four, the below-average group; and group five, the lowest intelligence group.

One tabulation sheet for each of the ten preference areas was constructed. Table III is an illustration of such a sheet showing the boys' data for the outdoor area. One can tell by examining this table that in the non-language section, for example, in group one, seven boys designated the outdoor area as being the one of prime interest to them. In the language section in group three, seventeen boys rated the outdoor area as the most appealing of the ten interest areas. Table IV gives similar information for the girls.

Another type of tabulation sheet was also needed. Table IV is an illustration of this and shows how many pupils from a certain school listed a particular area as a first choice within each intelligence group.

By using the type of tabulation system as ex- 
plained in the foregoing paragraphs, summary 
sheets were constructed as illustrated by Tables 
V through X. For example, Table V lists the 
first choices of interest areas and correspond- 
ing intelligence group standings for 408 ninth 
grade girls from the thirteen Michigan schools. 
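
The tabulation described above amounts to counting pupils by intelligence group and first-choice interest area, then totalling each area. A minimal sketch of that counting is given here, assuming each pupil is represented as a (group, area) pair; the function name `tabulate` and the five sample records are invented for illustration and are not data from the study.

```python
from collections import Counter

# The ten Kuder interest areas, as named in the table headings.
INTEREST_AREAS = ["Outdoor", "Mechanical", "Computational", "Scientific",
                  "Persuasive", "Artistic", "Literary", "Musical",
                  "Social Service", "Clerical"]
GROUPS = [1, 2, 3, 4, 5]  # group 1 = superior ... group 5 = lowest

def tabulate(records):
    """Build a summary sheet from (intelligence_group, first_choice) pairs.

    Returns (sheet, totals): sheet[group][area] is the number of pupils
    in that cell, and totals[area] is the column total for the area.
    """
    counts = Counter(records)
    sheet = {g: {a: counts[(g, a)] for a in INTEREST_AREAS} for g in GROUPS}
    totals = {a: sum(sheet[g][a] for g in GROUPS) for a in INTEREST_AREAS}
    return sheet, totals

# Made-up records, for illustration only.
sample = [(1, "Artistic"), (3, "Outdoor"), (3, "Outdoor"),
          (5, "Clerical"), (2, "Outdoor")]
sheet, totals = tabulate(sample)
print(sheet[3]["Outdoor"])   # pupils in group three choosing Outdoor
print(totals["Outdoor"])     # all pupils choosing Outdoor
```

Running the same count separately on the non-language, language, and total mental factor groupings would give the three sections of each summary table.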

Examining Table V reveals that in the non-language section, of the girls who are in the highest intelligence group (Group I), twelve listed the artistic interest area as their first choice.
In the language section, of the girls who are in 
the lowest intelligence group (Group V), four- 
teen chose as their first preference the clerical 
interest area. One can see, too, by examining 
the last row marked, ‘‘Total Number of Pupils,” 
that fifty girls listed the outdoor area as their 
first choice. Thus one can determine by exam- 
ining Table V how many girls of the 408 chose 
each interest area as a first choice, and how 
many girls of each of the five intelligence levels 
picked a particular interest area as a first 
choice. 

Interesting interpretations can be derived by examining these tables. In Table VII in the non-

1. Some schools administered the 1950 California test and the Kuder (BB) form. Others gave the 1950 California test and the Kuder (CH) form. The author is now in the process of collecting data for those schools administering the 1950 California test and the Kuder (CH) form.

2. Class A, 800 or more pupils; Class B, 325-799; Class C, 125-324; Class D, 75-124; and Class E, less than 75.




TABLE I 


ENROLLMENTS OF SCHOOLS AND NUMBER OF NINTH 
GRADE PUPILS FROM THE THIRTEEN SCHOOLS 
PARTICIPATING IN THE STUDY OF INTELLI- 
GENCE AND INTEREST 


          Enrollment      Number of Ninth Grade Pupils
School    Grades 9-12     Boys     Girls     Total
  1          132           20       12        32
  2           62            5        9        14
  3          335           44       54        98
  4          119           18       17        35
  5          387           61       47       108
  6          185           20       23        43
  7          298           28       33        61
  8          302           45       53        98
  9          311           40       46        86
 10          283           25       34        59
 11          238           29       28        57


TABLE II

LEVELS OF INTELLIGENCE BASED ON MICHIGAN NINTH GRADE NORMS SHOWING RAW SCORE POINTS AND CORRESPONDING PERCENTILES

[Table body not recoverable. Column groups: New California Mental Maturity, '47 S-Form, Advanced; Kuder Preference Record, Vocational, Form CH. Rows: Groups I through V of the levels of intelligence.]


TABLE III

NUMBER OF PUPILS IN 13 MICHIGAN SCHOOLS SELECTING OUTDOOR INTEREST AREA OF KUDER PREFERENCE RECORD AS FIRST CHOICE, AND GROUP STANDINGS ON CALIFORNIA MENTAL MATURITY TEST FOR 53 NINTH GRADE BOYS

[Table body not recoverable. Rows: Groups I through V in terms of raw score points on the New California Advanced '47 S-Form Test of Mental Maturity, under non-language factors, language factors, and total mental factors. Columns: schools, with total number of pupils.]


TABLE IV

NUMBER OF PUPILS IN SCHOOL NO. 1 AND THEIR CORRESPONDING FIRST CHOICES OF INTEREST AREAS OF THE KUDER PREFERENCE RECORD, AND GROUP STANDINGS ON THE CALIFORNIA MENTAL MATURITY TEST FOR 12 NINTH GRADE GIRLS

[Table body not recoverable. Rows: Groups I through V in terms of raw score points on the New California Advanced '47 S-Form Test of Mental Maturity, under non-language factors, language factors, and total mental factors. Columns: Kuder Preference Record, Vocational, Form CH interest areas (Outdoor, Mechanical, Computational, Scientific, Persuasive, Artistic, Literary, Musical, Social Service, Clerical), with total number of pupils.]


language section, 103 pupils of the 779 selected the outdoor interest area as their first choice. These 103 pupils are distributed through the intelligence groups as follows: In group one, there are ten pupils; group two, twenty; group three, thirty-one; group four, twenty-three; and group five, nineteen. This grouping changed somewhat in the language area. For example, fifteen pupils fell into the first intelligence level; twelve pupils in group two; thirty in group three; sev... into the other three levels.
Table X depicts the same type of relationship expressed in percentage of pupils rather than in ... Of the 103 pupils who picked the outdoor area as the first choice, 32 percent are in group three on the total mental factors ... Many other interpretations can be gleaned from an analysis of the data in Table X. For example, in the non- ... three; and the highest ... group one. In other words, those pupils ... on the non-language section ... the non-language group falling ... the highest percentage is in group one. As far as the interest areas per se, they rank as follows:

 1. Outdoor           13.2%    103 pupils
 2. Mechanical        12.2%     95 pupils
 3. Clerical          11.7%     91 pupils
 4. Artistic          11.3%     88 pupils
 5. Musical           10.9%     85 pupils
 6. Social Service     9.5%     74 pupils
 7. Computational      9.2%     72 pupils
 8. Literary           8.6%     67 pupils
 9. Science            7.1%     55 pupils
10. Persuasive         6.3%     49 pupils
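
Each percentage in this ranking is the area's share of the 779 pupils, rounded to one decimal place. The sketch below, offered only as a check on the arithmetic, recomputes the shares from the pupil counts that are legible in the scan and finds which whole-number counts are consistent with the printed percentages for the four areas whose counts are not legible. The names `pct` and `counts_matching` are illustrative and do not come from the article.

```python
TOTAL_PUPILS = 779

# Pupil counts that are legible in the scanned ranking.
legible_counts = {"Outdoor": 103, "Social Service": 74, "Computational": 72,
                  "Literary": 67, "Science": 55, "Persuasive": 49}

# Printed percentages for the areas whose counts are not legible.
printed_pct = {"Mechanical": 12.2, "Clerical": 11.7,
               "Artistic": 11.3, "Musical": 10.9}

def pct(count: int) -> float:
    """Share of the 779 pupils, as a percentage rounded to one decimal."""
    return round(100 * count / TOTAL_PUPILS, 1)

def counts_matching(percentage: float) -> list:
    """All whole-number counts whose rounded share equals the percentage."""
    return [n for n in range(TOTAL_PUPILS + 1) if pct(n) == percentage]

for area, n in legible_counts.items():
    print(f"{area}: {n} pupils = {pct(n)}%")

recovered = {area: counts_matching(p) for area, p in printed_pct.items()}
print(recovered)
```

Under these assumptions each printed percentage corresponds to exactly one count, and the recovered counts together with the legible ones add up to the 779 pupils in the study.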


However, in order to ascertain which intelligence group is responsible for making a particular interest area the popular area, an examination should be made of Tables XI, XII, and XIII. Tables XI and XII are summaries for the girls and boys respectively; Table XIII is a ... summary. An example should be used at this time to facilitate an interpretation of Table ... As has been pointed out previously, the outdoor area ranks first in popularity ... total number of pupils making choices. The highest percentage of those pupils actually choosing this area are in the middle ... group on the non-language, language, and total mental factors. An examination of Table XIII also shows that of the total number picking the computational area, the highest percentage on the ... language section fall in the highest intelligence groups.

The actual number of pupils in each intelligence group level is distributed among the sections of the California test as follows:


Non-Language Language 
ils 
Group 1 168 pupils Group 1 te P upils 
Group2 133 pupils Group 2 12 P piis 
GrOUD3 198 pupils Group 3 nrbes 
Group4 154 pupils Group 4 oe Pupils 
Group 5 126 Pupils Group 5 “8 


Total Mental Factors 
— Mental Factors 


Group 1 166 pupils 

Group 2 129 pupils 

Group3 238 pupils 

Group 4 118 pupils 

Group5 128 pupils b 

At first glance, one might say that since there are more pupils falling
within the average intelligence groups, there will be, therefore, more
pupils in each of the [...] group three, [...] ranking second. For
example, Table [...] the number of pupils falling in the [...] level of
intelligence on the [...] part had only one less pupil than the number
fall-

June, 1954) LEACH 


TABLE V

FIRST CHOICES OF KUDER INTEREST AREAS AND GROUP STANDINGS ON THE CALIFORNIA
MENTAL MATURITY TEST FOR 408 NINTH GRADE GIRLS FROM 13 MICHIGAN SCHOOLS --
EXPRESSED IN TERMS OF NUMBER OF PUPILS

[Table body illegible in this copy. Rows: Groups I-V, in terms of raw
score points, on the Non-Language, Language, and Total Mental Factors of
the New California Advanced '47 S-Form Test of Mental Maturity. Columns:
interest areas of the Kuder Preference Record. Legible raw-score cutoffs:
Group I, 39 or more; Group II, 36 - 38; Group III, 32 - 35; Group I, 33
or more.]

JOURNAL OF EXPERIMENTAL EDUCATION (Vol. XXII 


TABLE VI

FIRST CHOICES OF KUDER INTEREST AREAS AND GROUP STANDINGS ON THE CALIFORNIA
MENTAL MATURITY TEST FOR 371 NINTH GRADE BOYS FROM 13 MICHIGAN SCHOOLS --
EXPRESSED IN TERMS OF NUMBER OF PUPILS

[Table body illegible in this copy. Rows: Groups I-V, in terms of raw
score points, on the New California Advanced '47 S-Form Test of Mental
Maturity. Columns: interest areas of the Kuder Preference Record -
Vocational - Form CH, with the total number of pupils.]




TABLE VII

FIRST CHOICES OF KUDER INTEREST AREAS AND GROUP STANDINGS ON THE CALIFORNIA
MENTAL MATURITY TEST FOR 779 NINTH GRADE BOYS AND GIRLS FROM 13 MICHIGAN
SCHOOLS - EXPRESSED IN TERMS OF NUMBER OF PUPILS

[Table body illegible in this copy. Rows: Groups I-V, in terms of raw
score points, on the Non-Language, Language, and Total Mental Factors of
the New California Advanced '47 S-Form Test of Mental Maturity. Columns:
interest areas of the Kuder Preference Record.]




TABLE VIII

FIRST CHOICES OF KUDER INTEREST AREAS AND GROUP STANDINGS ON THE CALIFORNIA
MENTAL MATURITY TEST FOR 408 NINTH GRADE GIRLS FROM 13 MICHIGAN SCHOOLS --
EXPRESSED IN TERMS OF PERCENTAGE OF PUPILS

[Table body illegible in this copy. Rows: Groups I-V, in terms of raw
score points, on the Non-Language, Language, and Total Mental Factors of
the New California Advanced '47 S-Form Test of Mental Maturity. Columns:
interest areas of the Kuder Preference Record - Vocational - Form CH
(percentages), with the total number of pupils. Legible raw-score
cutoffs: Group II, 36 - 38; Group IV, 20 - 22; Group I, 68 or more;
Group II, 63 - 67.]




TABLE IX

FIRST CHOICES OF KUDER INTEREST AREAS AND GROUP STANDINGS ON THE CALIFORNIA
MENTAL MATURITY TEST FOR 371 NINTH GRADE BOYS FROM 13 MICHIGAN SCHOOLS --
EXPRESSED IN TERMS OF PERCENTAGE OF PUPILS

[Table body illegible in this copy. Rows: Groups I-V, in terms of raw
score points, on the Non-Language, Language, and Total Mental Factors of
the New California Advanced '47 S-Form Test of Mental Maturity. Columns:
interest areas of the Kuder Preference Record.]




TABLE X

FIRST CHOICES OF KUDER INTEREST AREAS AND GROUP STANDINGS ON THE CALIFORNIA
MENTAL MATURITY TEST FOR 779 NINTH GRADE BOYS AND GIRLS FROM 13 MICHIGAN
SCHOOLS - EXPRESSED IN TERMS OF PERCENTAGE OF PUPILS

[Table body illegible in this copy; the table was printed sideways. Rows:
Groups I-V on the Non-Language, Language, and Total Mental Factors of the
New California Advanced '47 S-Form Test of Mental Maturity. Columns:
interest areas of the Kuder Preference Record - Vocational - Form CH.]




TABLE XI

RANKS OF INTEREST AREAS AND CORRESPONDING INTELLIGENCE
GROUPS FOR 408 NINTH GRADE GIRLS FROM 13 MICHIGAN SCHOOLS

                               Intelligence Groups

      Interest         Non-Language         Language        Total Mental
Rank  Areas            Factors              Factors         Factors

1.5   Musical          IV                   III             II
1.5   Clerical         III                  III             IV
3     Outdoor          III                  II-IV           III
4     Artistic         I                    [illegible]     III
5     Computational    II                   III             II
6     Mechanical       I-III-IV-V (tie)     II              II
7     Social Service   I-IV (tie)           I-III (tie)     III
8     Literary         IV                   I               I
9     Science          I-IV (tie)           III             IV
10    Persuasive       III-IV (tie)         II              II

TABLE XII

RANKS OF INTEREST AREAS AND CORRESPONDING INTELLIGENCE
GROUPS FOR 371 NINTH GRADE BOYS FROM 13 MICHIGAN SCHOOLS

                               Intelligence Groups

      Interest         Non-Language     Language        Total Mental
Rank  Areas            Factors          Factors         Factors

1     Outdoor          III              [illegible]     III-IV (tie)
2     Mechanical       I                III             I
3     Artistic         I                III             III
4     Clerical         I                III             III
5     Musical          I                III             III
6     Social Service   II               III             III
7     Persuasive       II               III             III
8     Science          I                I               I
9.5   Computational    I                I               I
9.5   Literary         III              I               III

TABLE XIII

RANKS OF INTEREST AREAS AND CORRESPONDING INTELLIGENCE GROUPS
FOR 779 NINTH GRADE BOYS AND GIRLS FROM 13 MICHIGAN SCHOOLS

                               Intelligence Groups

      Interest         Non-Language     Language        Total Mental
Rank  Areas            Factors          Factors         Factors

1     Outdoor          III              III             III
2     Mechanical       I                III             II
3     Clerical         III              III             [illegible]
4     Artistic         I                III             II
5     Musical          IV               III             III
6     Social Service   III              III             III
7     Computational    I                III             I
8     Literary         III              II              II-III (tie)
9     Science          III              I               I
10    Persuasive       III              [illegible]     III


TABLE XIV

INTEREST AREAS IN RELATION TO INTELLIGENCE

[Graph not recoverable from this copy. Vertical axis: number of first
choices of each interest area. Horizontal axis: intelligence groups in
percentiles on the New California Mental Maturity Test, Advanced, 1947
S-Form. Plotted labels: Outdoor, Mech., Soc. Serv., Cler., Compt., Pers.,
Lit., Mus., Sci.]


ing in the average group. 

It is interesting to plot the number of first 
Choices of interest areas by pupils against the 
intelligence groups in which the choices were 
made. By so doing, we arrive at the graph in 
Table XIV. This table shows at each level of in- 
telligence the popularity ranking of the interest 
areas. In group one, the mechanical area is 
first; group two, literature and art are tied for 
first; group three, the outdoor area is first; 
group four, the clerical area was at the top; 
and in group five, the list is headed by the out- 
door area. 

The graph can also be read in this manner: 
Twenty-one pupils in group four signified that 
their first choice of interest area is clerical. In 
group two, nine of the pupils picked science as 
their first choice, although fifteen pupils picked
the mechanical area in that same group.

In summary, these conclusions merit attention:


1. In all intelligence groups, the persuasive 
interest ranks relatively low in popularity. 

2. In all intelligence groups, the science area
also ranks relatively low in popularity.

3. The social service area is relatively consistent
as far as its standing in each of the five
intelligence groups. With the exception of its
place in the superior group, where its rank is
next to last, it is near the middle of the rankings.

4. The outdoor area ranks highest in interest 
in the lowest and average intelligence 
groups and second in the below-average 
group. The highest intelligence group, 
however, places the outdoor area toward 
the middle of the list. 

5. The computational area received its high- 
est position in the superior intelligence 
group. 

6. Literature and art received their highest 
rankings in the above-average intelligence 
group. 

7. The clerical area received first place in 
the below-average intelligence group and 
ranked well up the scale in the average as 
well as the lowest intelligence group. 


It is true that it is dangerous to draw such conclusions on the basis
of limited data. Nevertheless, it is rather interesting to see such
relationships involving 779 ninth grade Kuder and California test
scores. Additional data gathered undoubtedly will clarify the
conclusions brought forth in this article.


OUTCOMES OF LECTURE AND DISCUSSION 
PROCEDURES IN THREE COLLEGE 
COURSES 


HARRY RUJA 
San Diego State College 
San Diego, California 


SECTION I 
INTRODUCTION 
Statement of the Problem 


THE INVESTIGATION here reported has sought to measure and contrast some
outcomes of lecture and discussion methods in college teaching. Since
the cultivation of emotional and social values as well as of
intellectual values has come to be included among the important
objectives of the schools, the present study has sought to assess these
outcomes as well as the intellectual outcomes of the two methods.


Definitions 


By "lecture" is meant continuous discourse by the instructor for
purposes of instruction. By "discussion" is meant interchange of
question and answer among students primarily, with the instructor
playing a role chiefly of moderator. The instructor roughly defines the
area of discussion and supplies information when directly asked for it
or when it illustrates a point already made or when it poses a question
relevant to the topic under consideration, but only if this material is
not readily available otherwise. Frequently, when he is asked a
question, he will turn it back to the student who has asked it, or to
other students in the class. Mostly the activity of the instructor
consists in reflecting the content and feelings of students' comments
and questions, relating these to one another and to a central topic,
and promoting orderly sequences of discussion. He does not correct or
confirm student judgments (or even misstatements of fact) but rather
accepts and reflects them in a "nondirective" manner. For fuller
descriptions of this procedure and point of view, see references 1, 3,
11, and 12 at end of this article.


Design of the Experiment 


The design of the experiment was as follows: 


Fall, 1951-1952:
Philosophy 1A (Introduction to Philosophy), lecture
Philosophy 1B (Introduction to Philosophy), discussion
Psychology 1 (General Psychology), discussion


Spring, 1951-1952:
Philosophy 1A, discussion
Philosophy 1B, lecture
Psychology 1, lecture


Schedules of assignments and discussion-topics, prepared in advance of
each semester and followed fairly faithfully, enabled the instructor to
devote about the same amount of time to a topic in both lecture and
discussion sections of a given class. These schedules were distributed
(or, in the philosophy classes, dictated) to students at the first
meeting. The same assignments were made to both lecture and discussion
sections, the same textual materials used, and the same examinations
administered. The writer served as instructor in all the classes
comprising the experimental groups of the present study. The classes
were part of the regular offering of San Diego State College, and
students selected them in a manner no different from that in which they
might enroll for any other such classes. The classes were not
identified in any way as experimental classes; the instructor did not
announce to his students that they were subjects in an experiment.


Hypotheses 


The experiment may be thought of as seeking
to assess the validity of four hypotheses; viz.,


* This paper is an abridgment of the writer's unpublished Master's
thesis, Outcomes of Lecture and Discussion Procedures in College
Teaching, San Diego State College, 1952.

The writer expresses [...] Manfred H. Schrupp, Wolcott C. Treat, and
Merle B. Turner of San Diego State College, [...] in the design,
execution, and reporting of the research reported here.



students in discussion classes in comparison
with students in lecture classes:

a. Show greater subject-matter mastery, as
measured by course examinations;

b. Exhibit greater gains in emotional adjustment
at the end of the semester, as measured by an
adjustment inventory;

c. Express more favorable attitudes toward
their instructor, as measured by an
instructor rating form; and

d. Become better acquainted with their
fellow-students, as evidenced by knowledge of a
greater number of names of fellow-students.


SECTION II 


MEASURING INSTRUMENTS 


Examinations 


The examinations were all objective-type (multiple choice), consisting
of from 32 to 94 items. Four examinations were administered to each of
the philosophy and five to each of the psychology classes, including a
comprehensive final examination. The writer computed split-half
coefficients of reliability for all examinations, correcting for length
with the Spearman-Brown formula. They ranged from .56 to .91, for the
most part increasing successively during the semester.
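
The split-half procedure with the Spearman-Brown correction can be sketched as follows. This is only an illustration under stated assumptions: the article does not say how the two halves were formed, so the odd-even split used here is an assumption, and the function names and the tiny score matrix are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_brown(r_half, factor=2.0):
    """Step a part-test correlation up to a test `factor` times as long."""
    return factor * r_half / (1.0 + (factor - 1.0) * r_half)


def split_half_reliability(item_scores):
    """item_scores: one row per student, one 0/1 entry per item.

    Assumption: halves are formed from odd- and even-numbered items.
    """
    odd_half = [sum(row[0::2]) for row in item_scores]
    even_half = [sum(row[1::2]) for row in item_scores]
    return spearman_brown(pearson_r(odd_half, even_half))


# Hypothetical data: three students, four items.
scores = [[1, 1, 0, 0], [1, 1, 1, 1], [0, 0, 0, 0]]
print(split_half_reliability(scores))
print(spearman_brown(0.5))
```

With a half-test correlation of .50, the corrected coefficient is 2(.50)/(1 + .50), or about .67, which shows why the corrected values run higher than the raw correlations between halves.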

Except for Philosophy 1B (where the textual material was somewhat
scanty), the examinations covered only textual material, [...]ents were
informed early in the seme[...] was a just-published standard text in
psychology (9). The textual material in the philosophy classes
consisted of mimeographed syllabi prepared by the instructor comprising
an [...]ing list. [...] examinations covered both textual material
[...] additional material presented in class, [...] student discussion;
occasi[...] supplied information when [...] required by the discussion
a[...] seem able to supply it themselves.

Since, presumably, students in both lecture and discussion groups had
equal access to the textual material, differential performance on the
tests presumably reflects differential effectiveness of the two methods
of teaching.

The examinations were prepared in the fall in advance so that the
instructor might not be biased more for the spring students by his
knowledge of the test-content than for the fall students.
Item analyses of each examination after its first administration were
made, and items proving inconsistent with the final test as a criterion
were not included for scoring purposes. The final examinations for the
most part consisted of items from the mid-terms which had proved
themselves (in the fall) to be most discriminating. The students were
informed that this would be the case and had opportunity to review the
mid-terms.

The examinations sought to test for significant factual knowledge as
well (as far as possible) as for understanding and reasoning ability.
The semester grade was determined not only by the scores on the
examinations, but also by papers that students prepared. Points earned
on the papers were added to the scores on the tests to constitute the
"performance" score.


Bell Adjustment Inventory 


This inventory, developed by Hugh M. Bell of Chico State College,
consists of 160 items designed to measure home, health, social, and
emotional adjustment. High coefficients of validity (.72 to .90) and of
reliability (.93) in school situations are reported by Bell for the
inventory (2). Although the validity of personality inventories in
general as measures of emotional adjustment has been questioned in
recent years, consideration of the exigencies of the classroom
situation led to the selection of the Bell as the most adaptable and
appropriate to the purpose at hand. The inventory was administered at
the beginning and end of the semester to each class involved in the
present study.


Instructor-Rating Form


Using the Likert technique (5), the writer developed a 90-item
instructor-rating form designed to tap, especially, the students'
emotional attitudes toward the instructor. Although the "best"
instructor, in terms of long-run changes in his students, is not
necessarily the most popular, yet it was thought desirable to measure
students' emotional reactions towards their instructor as possibly
diagnostic of their attitude towards the subject-matter [...] the
educational enterprise in general.

A coefficient of reliability for the scale, computed by the split-half
method and corrected by the Spearman-Brown formula, was found to be
.969. A coefficient of validity was secured as follows:


Students in two classes in Educational Psychology (N about 75) were
asked to rate two instructors of their own choosing, one of whom they
admired, the other whom they disliked. Numbered answer sheets were
distributed and the students instructed to use the even-numbered answer
sheets for rating the admired instructor, the odd-numbered answer
sheets for rating the disliked instructor. Neither the name of the
rater nor of the ratee was to appear on the answer sheets. The median
test (7, 8) for significance of the difference between the two
distributions gave the highly significant χ² of 96.9 (P < .001).
Biserial r between scores and expressed attitude was found to be .903.
Samples of items in the form follow:

He is likeable. 

He is sincere. 

He is at a mental plateau. 

He is devoid of personality. 


The students in each of the six classes in- 
volved in the present project had an opportunity 
at the end of the semester to rate the instruc- 
tor using this instructor-rating form. They 
were instructed to refrain from affixing their 
names to their rating forms. 


Free Comments 


After the final examination, each student was invited to write
anonymously on paper provided "any comments you may wish to make on the
course, the text, the method of instruction, the examinations, the
instructor, or on any other matter you care to discuss." A number did
so, and their comments throw some light on the dynamics of the
experiment, as we shall see later.


Names Known 


Using a method described by Pressey and Hanna (10), the writer asked
each student at the end of the semester to list the names of all his
fellow classmates whose names he knew. He was asked also to designate
those with whom during the semester he had associated outside of class
approximately 20 hours or more (disregarding association in other
classes). In effect, the writer was seeking to determine whether one
classroom procedure was more likely than the other to facilitate
students' becoming acquainted with one another, as indicated by
knowledge of one another's names and by their seeking out occasions for
association outside of class.


Other Data 


Additional data were secured (as far as possible) regarding each
student's age, score on the American Council on Education Psychological
Examination, and college status (as measured by number of college units
earned). These data were obtained from the college registrar; it was
unnecessary to ask the students for them. Sex of each student was also
recorded.


SECTION III 
RESULTS 


Performance Outcomes 


Analyses of covariance were performed to ascertain the significance of
the differences in performance scores when aptitude as measured by the
ACE was held constant. The procedure followed was that described by
Johnson (4), with some additions from McNemar (6). Only in the
Psychology 1 sections did a significant difference in performance
appear when ACE scores were kept constant. The uncorrected difference
in favor of the lecture section was 24 points. This difference is
significant at better than the one percent level of confidence. The
coefficient of correlation (Pearson r) between the ACE scores of these
Psychology 1 students and their performance scores is .485. This
coefficient represents a variance of .232. Hence, 23.2 percent of the
variance in the performance scores is accountable for in terms of the
ACE scores. When ACE was held constant, there was a difference in
performance of 19.9 points. This difference too is significant at
better than the one percent level of confidence. (The F ratio of the
"within" to the "between" variance is 9.49.)

In both philosophy courses, the differences, though not significant,
favored the discussion sections. The lecture sections in both cases had
greater scholastic aptitude (as measured by the ACE), but even when
this disadvantage was compensated for by the analysis-of-covariance
technique, the differences in performance in favor of the discussion
sections still failed to reach statistical significance.


Bell Adjustment Inventory Changes 


Judging from the norms Bell provides for interpreting scores on his
Inventory (2), a given score has a somewhat different significance for
men than for women. "Average" scores for men run between 23 and 41, but
for women they run between 25 and 47. A test of significance was
therefore applied to the sex distributions in all six classes. The
formulas are taken from McNemar (6). In none of the philosophy classes
did P reach .10. In the Psychology classes, however, with a
preponderance of women in the discussion section and men in the lecture
section, P was less than .02.


TABLE I

CHANGES IN BELL ADJUSTMENT SCORES FOR LECTURE AND DISCUSSION

[Table body illegible in this copy; the table was printed sideways.
Rows: Philosophy 1A; Philosophy 1B; Psychology 1 (men); Psychology 1
(women); Philosophy 1A and 1B. Column groups: Lecture; Discussion;
Discussion vs. Lecture.]

Note: A minus difference represents a gain in adjustment on the Bell
Adjustment Inventory.


TABLE II

SIGNIFICANCE OF DIFFERENCE IN RESPONSE TO INSTRUCTOR-RATING
FORM FOR LECTURE VERSUS DISCUSSION
(D. F. in each case is 1)

[Table body illegible in this copy; the table was printed sideways.
Rows: χ² and P. Columns: Philosophy 1A; Philosophy 1B; Psychology 1.]

Note: The difference favors the discussion sections in each case except
for Psychology 1.



Bell score changes then were computed for the 
men and for the women separately in the psy- 
chology classes. 

Table I reports all Bell score changes and 
their significance. The formulas are taken 
from McNemar (6). As the table shows, the 
Philosophy 1A students in both sections made 
highly significant gains (toward better adjust- 
ment). Both men and women in the Psychology 
1 lecture section made gains significant at the 
9 percent level. The men in the Psychology 1 
discussion section made gains significant at the 
10 percent level. None of the other changes 
reached this level of significance. 

When we contrast the changes in the lecture sections with the changes
in the corresponding discussion sections, we find that not one of the
differences in changes reaches the 10 percent level of significance. In
all cases, the lecture sections showed greater gains (toward better
adjustment) than the discussion sections. Although none of these
differences is individually significant, the congruence of results all
favoring the lecture sections bears consideration. By chance, one could
expect such a four-fold congruence only once in 16 (2^4) times; that
is, the probability of such an occurrence is .06. This approaches
significance.
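
The chance figure quoted above can be checked with a short sketch. It assumes that the four comparisons are independent and that each is equally likely to favor either method by chance; the function names are hypothetical and are not taken from the article.

```python
from math import comb


def prob_all_same_direction(n_comparisons, p=0.5):
    """Chance that every one of n independent comparisons favors one
    pre-specified method, if each favors it with probability p."""
    return p ** n_comparisons


def sign_test_one_sided(successes, n, p=0.5):
    """One-sided binomial (sign test) probability of at least
    `successes` favorable outcomes in n comparisons."""
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k)
               for k in range(successes, n + 1))


print(prob_all_same_direction(4))   # 0.0625, i.e. once in 16 times
print(sign_test_one_sided(4, 4))    # same value, via the binomial sum
```

Under these assumptions the probability is .5 raised to the fourth power, or .0625, which rounds to the .06 reported in the text.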

Pooling the gains for the philosophy lecture 
sections and comparing them with the pooled 
gains of the philosophy discussion sections also 
gives a slight advantage to the lecture sections, 
but this too is not statistically significant. 


Instructor-Rating Form Results 


Table II exhibits chi squares and P's for the differences between the
corresponding distributions of response to the Instructor-Rating Form
as computed by the median test (6, 7, 8). As the table shows, both
philosophy discussion sections responded significantly more favorably
to the instructor than the corresponding lecture sections. No
significant difference appeared in the responses of the two psychology
sections. Such difference, however, as exists favors the lecture
section.
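
A minimal sketch of a two-sample median test follows, to show where a chi square with one degree of freedom comes from. It is an illustration only: the article does not state whether a continuity correction was applied or how scores falling exactly at the median were handled, so the `yates` flag and the choice to count such scores with the lower group are assumptions, and the rating totals shown are invented.

```python
from statistics import median


def median_test(sample_a, sample_b, yates=False):
    """Chi square (1 d.f.) for a 2 x 2 table of counts above and not
    above the median of the two samples combined."""
    grand_median = median(list(sample_a) + list(sample_b))
    a_hi = sum(1 for v in sample_a if v > grand_median)
    b_hi = sum(1 for v in sample_b if v > grand_median)
    a_lo = len(sample_a) - a_hi   # scores at the median count as "not above"
    b_lo = len(sample_b) - b_hi
    n = a_hi + a_lo + b_hi + b_lo
    denom = (a_hi + b_hi) * (a_lo + b_lo) * (a_hi + a_lo) * (b_hi + b_lo)
    if denom == 0:
        raise ValueError("a row or column of the 2 x 2 table is empty")
    diff = abs(a_hi * b_lo - b_hi * a_lo)
    if yates:
        # Optional continuity correction (assumption, see lead-in).
        diff = max(diff - n / 2, 0.0)
    return n * diff ** 2 / denom


# Invented rating totals for two sections of four students each.
lecture = [1, 2, 3, 4]
discussion = [5, 6, 7, 8]
print(median_test(lecture, discussion))              # 8.0
print(median_test(lecture, discussion, yates=True))  # 4.5
```

A chi square with one degree of freedom must exceed 3.84 to reach the 5 percent level, so both values in this invented example would count as significant at that level.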


Some specific items were selected for evaluation when on inspection the
differences in response seemed large enough to favor a likelihood of
significance. The median test was used [...]. Two items thus selected
favor the lecture section of Psychology 1 at the 10 percent level of
significance or better. Two items favor the discussion section of
Philosophy 1A at the two percent level of significance or better, and
two favor the discussion section of Philosophy 1B at the 5 percent
level of significance or better.

Those favoring the discussion sections are:

13. He is willing to accept new ideas.

37. He gives students the feeling that they
are part of the "teaching process," not
mere receivers of the teaching.

47. He provokes lively discussions.

98. He fidgets while talking.


Those favoring the psychology lecture section are:

99. He apparently puts many hours of preparation
into his classes.

75. His lectures are logical.


Free Comments


The writer examined the free student comments (secured at the end of
the semester for each class in the manner described above) and labeled
each as favorable, predominantly favorable, unfavorable, and
predominantly unfavorable. Since the numbers were rather small, the
favorable and predominantly favorable categories were combined, as were
also the unfavorable and predominantly unfavorable categories. A
sizeable proportion of students (40 percent in all) did not take
advantage of this opportunity to comment. With regard to those who
responded, the results show that the lecture sections are at no
disadvantage as far as the distribution of free favorable comments is
concerned. Two of the differences favor the lecture sections
(Philosophy 1B and Psychology 1) and one the discussion section, but in
no case does the difference between lecture and discussion reach a five
percent level of significance. In the Psychology 1 classes, the
difference favoring the lecture section is significant at the 10
percent level of confidence.

The content of the comments may be significant for a better
understanding of the dynamics of the two learning situations.
Consideration of the content of the comments suggests these trends:

Variations in response among students seem somewhat greater for the
lecture sections than for the discussion sections. Thus one student in
Philosophy 1A lecture wrote "I found this a very stimulating and
worthwhile class." Another wrote: "The lectures were complete and
well-presented." In contrast, a third in the same class wrote "The
lectures seemed to tend toward monotony at times." Similarly, in the
Psychology 1 lecture, one student wrote "Instructor makes lectures too
good to miss," while another wrote "The lectures were rather [...] and
[...] little to my understanding of the subject matter."

Those who responded favorably to the lecture situation expressed their
appreciation in terms of what they learned. In the discussion
situation, the emphasis seemed to fall more on the [...] the student
had received rather than on the information he had acquired. In the
Philosophy 1A discussion, a student wrote "I walked out of this class
feeling as though I had used what intellect I may have." Similarly, in
the Psychology 1 discussion, a student commented, "I thoroughly approve
of the method of instruction, as it stimulates the students to think
more about the subject." Conversely, dissatisfaction in the discussion
sections was expressed in terms of the instructor's failure to transmit
information. Wrote a student in the Philosophy 1A discussion, "You did
not tell us enough." Another, "At times it was frustrating just to
figure out exactly what you did feel or believe." A third, "You never
voice an opinion." Yet ironically, when, in the lecture sections, the
instructor did express his views, he was criticised for doing so. In
the Philosophy 1A lecture, a student wrote, "The instructor is too
biased in his opinions. He definitely teaches his own philosophy
unerringly."
Devaluations of oneself and of one's fellow-students were frequently
expressed in the discussion [...] "We are just beginners in a [...]
decided," [...] required [...] "There seemed [...] more expla[...]
material [in the syllabus]." "Personally [...] The instructor should
follow the test [sic] book like the other teachers do." "The purpose
and meaning of experiments [should] be given a fuller treatment in
class discussion." "The exams are too specific."

Names Known
Names Known 


The writer tabulated the number of names of fellow-students known at
the end of the semester by each student. Table III presents comparisons
of the number of names known by students in lecture classes compared
with the number in discussion classes. The table also presents data
regarding the names of those with whom the student had associated
outside of class approximately 20 hours or more. Names of those known
previous to entering class were excluded. The students also were to
disregard time spent in other classes with a student when judging
whether 20 hours or more had been spent in his company.

As the table shows, in every case, students in the discussion sections
knew more names than those in the corresponding lecture sections, at a
level of significance better than .001. Formulas used to compute C. R.
and t are from McNemar (6). In addition, the students in the Philosophy
1A discussion section had cultivated intensive acquaintance with their
fellow-students in significantly (P < .001) greater numbers than had
those in the lecture sections. (P was not significant in the other
courses.)

It should be added here that students in the lecture sections had the
following opportunities to learn the names of their fellow students:


a. Toward the end of the semester, each student (present at the time)
was informed of his standing to date. On the hectographed sheet
recording his grade were contained also the names of all the other
students in the class.

b. The instructor followed the practice of calling each student by name
when recognizing him to speak.

c. Students were seated alphabetically at the beginning of the
semester; the instructor called their names aloud when he seated them.

There is no question, however, that students in the discussion sections
heard the names of their fellow-members more frequently than did
members of the lecture sections. In addition, in the course of the
discussions, students would address their remarks to some particular
student and call him by name.


Other Data 


Analysis was 
age and college 
tions of eache 
again used for 
ences, 


made of the distributions of — 
Status for the contrasting se¢ 
curse. The median test was _/ 
assessing significance of differ 


None of the differences reached a 5 percent 




TABLE III

NAMES KNOWN AND INTENSIVE ACQUAINTANCESHIPS

[Table values not recoverable from the source. Columns: Discussion and Lecture sections of Philosophy 1A, Philosophy 1B, and Psychology 1. Rows, under Names Known and again for the second measure: M, σ, σD (or s²), and C. R. (or t).]




or better level of significance except those for
Psychology 1. Here the discussion students
were significantly (P < .001) younger and had
had less college work than the corresponding
lecture students. Indeed, 87 percent of the Psy-
chology 1 discussion students had had no college
work previously whatsoever, in comparison with
30 percent in the Psychology 1 lecture section who
had had no previous college work. Ninety-three per-
cent of the Psychology 1 discussion students
were under 20 years of age, compared with 49
percent of the psychology lecture students in
that age group. In none of the other courses
were the age and college-status differences so
marked.
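
The median test used for these age and college-status comparisons can be sketched as follows. This is a minimal illustration under stated assumptions: two groups are split at the median of the combined sample and the resulting fourfold table is tested with an uncorrected chi-square on 1 degree of freedom. The article does not state which variant of the test or which correction was used, and the function name and the ages below are hypothetical.

```python
from statistics import median

def median_test(group_a, group_b):
    """Return the combined median and the chi-square (1 df) of the fourfold table."""
    grand = median(list(group_a) + list(group_b))
    a_above = sum(x > grand for x in group_a)   # cases above the combined median
    b_above = sum(x > grand for x in group_b)
    a_below = len(group_a) - a_above            # cases at or below the combined median
    b_below = len(group_b) - b_above
    n = len(group_a) + len(group_b)
    denom = (a_above + b_above) * (a_below + b_below) * len(group_a) * len(group_b)
    if denom == 0:
        return grand, 0.0                       # degenerate table: no evidence of difference
    chi2 = n * (a_above * b_below - b_above * a_below) ** 2 / denom
    return grand, chi2

# Hypothetical ages for a discussion section and a lecture section.
discussion_ages = [18, 18, 19, 19, 19, 20]
lecture_ages = [19, 20, 21, 22, 23, 25]
grand, chi2 = median_test(discussion_ages, lecture_ages)
print(grand, round(chi2, 2))
```

A chi-square above 3.84 corresponds to the 5 percent level with 1 degree of freedom. For samples as small as this illustration an exact test would be more appropriate.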


SECTION IV 
DISCUSSION 


This study leaves unanswered a great many
questions. Some of these contain implications
for further research.


Self-abnegation 
ability to achiev 
make a difference? 


[...] Psychology discussion section, the men made a
larger gain in emotional adjustment than the
women, while in the lecture section it was the
reverse. It might have been desirable to incor-
porate the age, college status, and sex variables
into the analysis of covariance, or to make ap-
propriate corrections in some other manner, so
that these factors too may have been kept con-
stant for performance comparisons.

4. Are differences in social climate import-
ant? Does a fluid individualistic society make
different educational demands from those which
a stable, conformity-exacting society makes?

5. What are the long-run effects of the two 
methods? Which leads to continued study? 
Which promotes the capacity to deal with new 
problems more? Which is more likely to moti-
vate the student toward thorough mastery as con-
trasted with superficial acquaintance and glib
parroting? Which cultivates more the habits
of independent critical and creative thinking?
Compelling the student to rely upon his own
powers may make him feel insecure if he is un-
accustomed to this shouldering of responsibility
or unable to handle this responsibility. But if
this experience is a prelude to growth, then per-
haps it is to be welcomed rather than censured.

6. What outcomes ensue when the two meth-
ods are combined? Does degree of nondirect-
iveness in discussion make a difference?


These are broad questions and would need to 
be translated into specific research issues be- 
fore progress could be made toward their solu- 
tion. They indicate, however, the complexity 
of the problem. Much work, it is perhaps
needless to say, remains to be done in this area
in order to identify the most effective proced- 
ures in college teaching. 


SECTION V 
SUMMARY 
The present study has sought to measure sig-
nificant outcomes of two major teaching proced-
ures in college teaching. Specifically, it sought
to determine whether there are significant dif-
ferences (a) in mastery of subject-matter as
measured by course examinations and assigned
papers, (b) in emotional adjustment as meas-
ured by the Bell Adjustment Inventory, (c) in
attitude toward the instructor as [...] by free
student-comments at the end of the
semester, and (d) in social adjustment as meas-
ured by extent of knowledge of names of fellow-
members and extent of acquaintanceship with
fellow-members for periods of 20 hours or more
during the course of the semester.


Hypotheses 


Four hypotheses were set up for evaluation.
We may consider each hypothesis now in turn.


a. The first hypothesis is not supported. The
students in the discussion sections did not sur-
pass those in the lecture sections in subject-
matter mastery. Although the philosophy stud-
ents in the discussion sections did equally well
with those in the lecture sections, the psychol-
ogy discussion students performed at a signifi-
cantly lower level than the psychology lecture
students.

b. The second hypothesis is not supported. 
Use of the Bell Adjustment Inventory showed no 
significant differences in adjustment gains for
discussion as compared with lecture. Rather, 
for each of the comparisons made (Philosophy 
1A and 1B, Psychology 1 men and Psychology 1 
women), the lecture sections made greater gains 
than the corresponding discussion sections; hence 
there is some strength to the contrary hypothesis 
that the lecture method promotes emotional ad- 
justment, as measured by responses to the Bell 
Inventory. 

c. The third hypothesis is supported in part. 
The instructor was rated more favorably by the 
philosophy discussion sections than by the cor- 
responding lecture sections. However, no sig- 
nificant difference appeared for the psychology 
classes. 

The discussion sections felt that the instruc-
tor was willing to accept new ideas, that he gave
students the feeling that they were part of the
“teaching process” and not mere receivers of
the teaching, and that he provoked lively dis-
cussions; moreover, they were less conscious
of nervous mannerisms which the instructor ex-
hibits as he speaks. Those who favored the lec-
ture procedure were impressed by the amount
of time the instructor apparently put in in prep-
aration for his classes and by the logical and
systematic quality of his lectures.


The students in the lecture sections stressed 
the classroom situation as providing opportun- 
ities for learning new ideas. The students in 
the discussion sections seemed more likely to 
blame themselves for failure than the instruc-
tor, the examinations, or the textbook. On the
other hand, they missed the sure direction and
definite assurances which an instructor-center-
ed class can provide.

d. The fourth hypothesis is supported. As is 
perhaps to be expected, students in the discus- 
sion sections got to know one another in greater 
numbers than in the lecture sections. This is, 
perhaps, however, only a superficial acquaint- 
ance as measured by knowledge of one another's 
names. Only for the Philosophy 1A students was 
the difference between the number of those in the
discussion section who spent 20 hours or more
in one another's company and the number of those
in the lecture section who did the same signifi-
cantly different. Perhaps a one-semester class
which meets but two or three times per week
cannot be expected to do more in this regard
than to acquaint students with one another. This,
there is no doubt, the discussion room can do
much better than the lecture hall. 

In short, lecture proved superior in subject- 
matter mastery for the Psychology 1 students. 
Discussion proved superior in names known, 
for all classes; in attitude toward instructor, 
for the philosophy classes; and in intensive 
acquaintance, for the Philosophy 1A class. In 
all other regards the two methods showed no
significant differences. (See Table IV.)

To express the results in terms of the course: 
the philosophy discussion students learned more 
classmates’ names and rated the instructor high- 
er than did the philosophy lecture students. The 
Philosophy 1A discussion students, in addition, 
established a greater number of intensive ac-
quaintanceships than the 1A lecture students.
In all other regards, the results were equal
for the lecture and discussion groups. The Psy-
chology 1 lecture students learned more of the
content of the course but knew fewer classmates’
names; in all other regards the results were
equal.


REFERENCES 


1. Asch, M. J. “Nondirective Teaching in Psychology:
An Experimental Study,” Psychological Monographs,
LXV (1951).

2. Bell, H. M. Manual for the Adjustment Inventory
(Stanford, Calif.: Stanford University Press).

3. Faw, V. “A Psychotherapeutic Method of Teaching
Psychology,” American Psychologist, IV (April 1949),
pp. 104-109.

4. Johnson, P. O. Statistical Methods in Research
(New York: Prentice-Hall, 1949).

5. Likert, R. “A Technique for the Measurement of
Attitudes,” Archives of Psychology, XXII (1932).

6. McNemar, Q. Psychological Statistics (New York:
John Wiley and Sons, 1949).

7. Mood, A. M. Introduction to the Theory of Statistics
(New York: McGraw-Hill Book Co., 1950).

8. Moses, L. E. “Non-Parametric Statistics for
Psychological Research,” Psychological Bulletin,
XLIX (March 1952), pp. 122-143.

9. Murphy, G., and Spohn, H. An Introduction to
Psychology (New York: Harper and Brothers, 1951).

10. Pressey, S. L., and Hanna, D. C. “The Class as a
Psycho-Sociological Unit,” Journal of Psychology,
XVI (January 1943), pp. 13-19.

11. Rogers, C. R. Counseling and Psychotherapy
(Boston: Houghton Mifflin Co., 1942).

12. Rogers, C. R., and others. Client-Centered Therapy:
Its Current Practice, Implications, and Theory
(Boston: Houghton Mifflin, 1951).

13. Ruja, H. “A Student-Centered Instructor-Rating
Scale,” Educational Administration and Supervision,
XXXIX (April 1953), pp. 209-217.


ERRATUM 


Mr. William H. Lucow wishes to correct a small er-
ror that appeared in his article, “Estimating Compon-
ents of Variation in an Experimental Study of Learn-
ing,” published in the March 1954 Journal of Experi-
mental Education, page 270, first column, second ex-
pression. This expression should read:


\[
\left(K - \sqrt{K^2 - 1}\right)\frac{s_1^2}{s_2^2} < \frac{\sigma_1^2}{\sigma_2^2} < \left(K + \sqrt{K^2 - 1}\right)\frac{s_1^2}{s_2^2}
\]

